Project Title: Company Information Extraction from Enfsolar

Project Description:


We need someone to scrape company information.

This is an ongoing project.

What pages to crawl?

Please crawl the pages from the company lists below:

Please crawl all the fields from these pages into one big Excel file, with NO DUPLICATES PLEASE. If one company exists in different categories (for example, Sunergy LLC appears in both the inverter and charge_controller categories), merge all of its fields together and keep only one unique record for that company.
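To illustrate the merge rule, here is a minimal sketch in Python. It assumes each scraped row is a dict and that companies are matched on a case-insensitive, whitespace-normalized name. The field names used here (company_name, category, phone, website) are placeholders only; the real names should follow the attached image.

```python
def normalize_name(name):
    """Collapse whitespace and ignore case so 'Sunergy  LLC' == 'sunergy llc'."""
    return " ".join(name.split()).casefold()


def merge_records(records):
    """Merge rows that describe the same company into one record.

    Assumption: 'company_name' identifies a company and 'category' is the
    list the row was scraped from. Both names are hypothetical.
    """
    merged = {}
    for rec in records:
        key = normalize_name(rec["company_name"])
        target = merged.setdefault(key, {})
        for field, value in rec.items():
            if value in (None, ""):
                continue  # never overwrite real data with a blank
            if field == "category":
                cats = target.setdefault("category", [])
                if value not in cats:
                    cats.append(value)  # keep every category, once each
            elif field not in target:
                target[field] = value  # first non-empty value wins

    result = []
    for rec in merged.values():
        row = dict(rec)
        row["category"] = "; ".join(row.get("category", []))
        result.append(row)
    return result


rows = [
    {"company_name": "Sunergy LLC", "category": "inverter", "phone": "123"},
    {"company_name": "sunergy  llc", "category": "charge_controller",
     "website": "https://example.com"},
]
unique_rows = merge_records(rows)  # one record holding both categories
```

Matching on the name alone is a judgment call; if two different companies share a name, a second key such as the profile URL would be safer.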

Please also save the company profile images into a folder and put each image's file name, such as “xxx.jpg”, into the corresponding Excel column.
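One possible way to handle the image step is sketched below in Python. The naming scheme (company name turned into a lowercase file stem, with the extension taken from the image URL), the "images" folder name, and the function names are all assumptions for illustration. The image bytes are assumed to have been downloaded already by whatever HTTP client the scraper uses.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def image_filename(company_name, image_url):
    """Build a file-system-safe name such as 'sunergy_llc.png'."""
    stem = re.sub(r"[^a-z0-9]+", "_", company_name.lower()).strip("_")
    # Take the extension from the URL path; fall back to .jpg if none.
    ext = Path(urlparse(image_url).path).suffix.lower() or ".jpg"
    return f"{stem}{ext}"


def save_image(image_bytes, company_name, image_url, folder="images"):
    """Write the image into `folder` and return the name for the Excel cell."""
    Path(folder).mkdir(parents=True, exist_ok=True)
    name = image_filename(company_name, image_url)
    (Path(folder) / name).write_bytes(image_bytes)
    return name
```

The returned name is what would go into the image column of the company's row, so the Excel file and the folder stay in sync.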

Please also crawl the LinkedIn overview section if the company profile page has a LinkedIn icon.

For example:

If you click the LinkedIn social media icon,
it goes to: https://www.linkedin.com/company/xiamen-corigy-new-energy-technology-co–ltd/

Then go to the About page:
Please crawl the Overview section there and save this field into the big Excel file as well, using “aboutus” as the column heading.
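As a rough sketch of this step in Python: the helper below derives the About page address from the company's LinkedIn URL, assuming the usual /company/<name>/about/ layout, and fills the “aboutus” column only when a LinkedIn link was found. The function names and the example URL are hypothetical, and fetching the Overview text itself is left to the scraper.

```python
from urllib.parse import urlsplit, urlunsplit


def linkedin_about_url(company_url):
    """Turn a LinkedIn company URL into its About page URL."""
    parts = urlsplit(company_url)
    path = parts.path.rstrip("/")
    if not path.endswith("/about"):
        path += "/about"
    # Drop any query string or fragment (tracking parameters and the like).
    return urlunsplit((parts.scheme, parts.netloc, path + "/", "", ""))


def with_aboutus(record, linkedin_url, overview_text):
    """Return a copy of the record with the 'aboutus' column set.

    Companies without a LinkedIn icon get an empty 'aboutus' cell.
    """
    row = dict(record)
    row["aboutus"] = overview_text.strip() if linkedin_url and overview_text else ""
    return row


about = linkedin_about_url("https://www.linkedin.com/company/example-co/?trk=abc")
# about == "https://www.linkedin.com/company/example-co/about/"
```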

What fields to crawl?
>For each company, crawl only the necessary fields and name each field after the names in the attached image, plus the optional LinkedIn Overview section. If the actual pages contain any extra fields not described in the image, please ask me first; after that, you may give them any name you like.

>Save the scraped information into one big Excel file. NO DUPLICATE COMPANY PROFILES PLEASE.

For similar work requirements, feel free to email us at info@logicwis.com.