Project Title: Extract Specific Information from Goodreads
I am looking for a way to extract specific information from Goodreads.com.
Every book page on Goodreads has a Q&A page, for example: https://www.goodreads.com/book/2.Harry_Potter_and_the_Order_of_the_Phoenix/questions
I want the timestamp of the first question posted on this page, along with the timestamp of its answer.
As you can see, timestamps are displayed in relative form (“6 years ago”).
I would like to know whether there is a way to recover this information more precisely.
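One option is the Internet Archive, which stores dated snapshots of these pages (see, for example, http://web.archive.org/web/20160505114338/https://www.goodreads.com/book/2.Harry_Potter_and_the_Order_of_the_Phoenix/questions). Combining a snapshot's capture date with the relative timestamp shown in that snapshot should give an approximate absolute date.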
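The Internet Archive idea can be sketched roughly as follows. This is a minimal illustration, not a tested scraper: it builds a query URL for the Wayback Machine CDX API (at most one capture per month), parses the 14-digit capture timestamp that appears in snapshot URLs, and turns a relative label such as "3 days ago" into a window of possible posting dates. The helper names (`cdx_query_url`, `parse_wayback_timestamp`, `posted_window`) are hypothetical, and the code assumes that Goodreads rounds relative timestamps down, so "N units ago" means between N and N+1 units before the capture. That assumption would need checking against real pages.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import urlencode

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"

# Approximate unit lengths in days (assumption: average month and year lengths).
UNIT_DAYS = {"minute": 1 / 1440, "hour": 1 / 24, "day": 1, "week": 7,
             "month": 30.44, "year": 365.25}

RELATIVE_RE = re.compile(r"\b(\d+|an?)\s+(minute|hour|day|week|month|year)s?\s+ago")


def cdx_query_url(page_url):
    """Build a CDX query listing successful captures, at most one per month."""
    params = {
        "url": page_url,
        "output": "json",
        "fl": "timestamp,original",
        "filter": "statuscode:200",
        "collapse": "timestamp:6",  # first 6 digits = YYYYMM
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_wayback_timestamp(ts):
    """Convert a 14-digit Wayback timestamp (YYYYMMDDhhmmss) to a datetime."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S")


def posted_window(relative_text, captured_at):
    """Return (earliest, latest) posting dates implied by a relative label.

    Assumes the label is rounded down: "N units ago" covers [N, N+1) units.
    """
    match = RELATIVE_RE.search(relative_text.lower())
    if match is None:
        raise ValueError(f"unrecognized relative timestamp: {relative_text!r}")
    count_text, unit = match.groups()
    count = 1 if count_text in ("a", "an") else int(count_text)
    unit_days = UNIT_DAYS[unit]
    latest = captured_at - timedelta(days=count * unit_days)
    earliest = captured_at - timedelta(days=(count + 1) * unit_days)
    return earliest, latest


captured = parse_wayback_timestamp("20160505114338")
print(posted_window("3 days ago", captured))
```

The earliest snapshot that already contains the question matters most: a label in days or months gives a narrow window, while a label in years gives a window a full year wide, which is too coarse for month-level precision.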
Given ~100,000 book titles, I would like an estimate of the feasibility and cost of combining Goodreads with the Internet Archive to obtain month-level precision on the timestamps of the first question and its answer.
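As a starting point for the feasibility estimate, the request volume and crawl time can be worked out with simple arithmetic. The sketch below assumes one snapshot-index lookup per title plus a fixed number of snapshot downloads per title, all issued sequentially at a fixed rate. The values used (3 snapshots per title, 1 request per second) are placeholders chosen for illustration, not published Internet Archive limits, and the function name `estimate_crawl` is hypothetical.

```python
def estimate_crawl(titles, snapshots_per_title, requests_per_second):
    """Rough request count and wall-clock time for a sequential crawl."""
    index_lookups = titles                       # one snapshot-list query per book
    snapshot_fetches = titles * snapshots_per_title
    total_requests = index_lookups + snapshot_fetches
    hours = total_requests / requests_per_second / 3600
    return {"total_requests": total_requests, "hours": hours, "days": hours / 24}


# Placeholder inputs: 100,000 titles, 3 snapshots each, 1 request per second.
result = estimate_crawl(100_000, 3, 1.0)
print(result["total_requests"], round(result["hours"], 1), round(result["days"], 1))
```

Under these placeholder inputs the crawl needs 400,000 requests and roughly 111 hours of sequential fetching. The real figure depends on how many of the titles have archived Q&A pages at all and on how many snapshots must be inspected before the first question appears.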