Web scraping or web data extraction?

By Jacob Laurvigen on September 20th, 2016




If you are a developer, you might have noticed a change in the wording around what would normally be described as web scraping. It is now often called web research or, as we call it, web extraction. So why don’t we just call it web scraping? Depending on geographic location, opinions on whether web scraping is a good or a bad thing range from “web scraping is a natural tool for data research” to “this is a gray zone”.

 

What web scraping really is

Web scraping is gathering data from the web, and the purposes for doing so can be as varied as the reasons people read books, newspapers or any other source of knowledge. What you can’t do, of course, is copy a text, image, etc. and publish, present or resell it as your own work. Just as a journalist reads other newspapers, you would need to write your own version or simply use the newly acquired knowledge to navigate. In essence, it applies robotics to a manual job you would already be doing when you start your computer and open a browser to look for information. The dexi.io web scraping robots (or web extraction robots, as we call them) automate a string of events to save you or your organisation from, for example, writing all this information down manually or pressing F5 all day long to look for changes. So web scraping helps you automate manual, time-consuming jobs, and yes, that IS a good thing!
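To make the "pressing F5 all day long" idea concrete, here is a minimal Python sketch of automated change detection. It is not how dexi.io robots are implemented; the helper names `fetch`, `fingerprint` and `has_changed` are hypothetical and exist only for illustration. The sketch assumes that comparing a hash of the raw page bytes is a good enough signal that something changed.

```python
import hashlib
import urllib.request
from typing import Optional


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw bytes of a page (hypothetical helper, not a dexi.io API)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest that stands in for the page content."""
    return hashlib.sha256(content).hexdigest()


def has_changed(previous: Optional[str], content: bytes) -> bool:
    """Report whether the content differs from the previously stored fingerprint.

    A missing previous fingerprint (first visit) counts as a change.
    """
    return previous != fingerprint(content)
```

In practice you would call `fetch` on a schedule, store the latest fingerprint, and act only when `has_changed` returns True. Real pages often contain timestamps or ads that change on every load, so a robust robot would compare only the extracted fields it cares about rather than the whole page.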

 
