How to get images of Discogs releases?

Question

I want to get images of Discogs releases. Can I do it without Discogs API? They don't have links to the images in their db dumps.

VC.One · Answer 1 · 2016-11-19T03:44:01.663

To do this without the API, you would have to load the release's web page and extract the image URL from the HTML source code. You can find the relevant page by loading https://www.discogs.com/release/xxxx where xxxx is the release number. Since HTML is just text, you can then search it for the JPEG URL.

I don't know what your programming language is, but I'm sure it can handle String functions like indexOf and substring. You can extract the content of the page's og:image meta tag to get the picture URL.

So taking an example: https://www.discogs.com/release/8140515

Find startPos = myHtmlStr.indexOf("og:image\" content=\""); and save it as an integer.
That marker is 19 chars long, so next do endPos = myHtmlStr.indexOf(".jpg", startPos + 19);
This gets the first occurrence of .jpg after the marker, whatever other chars come in between.
Now extract a substring from the HTML text: img_URL = myHtmlStr.substring(startPos + 19, endPos + 4); (the + 4 keeps the ".jpg" itself, because endPos points at its first character).
You should end up with a string reading like this below (extracted URL):
https://img.discogs.com/_zHBK73yJ5oON197YTDXM7JoBjA=/fit-in/600x600/filters:strip_icc():format(jpeg):mode_rgb():quality(90)/discogs-images/R-8140515-1460073064-5890.jpeg.jpg
The process can be shortened to finding the startPos index of https://img., then finding the first occurrence of .jpg when searching from that startPos index, and extracting everything in that range (including the .jpg). This works because the image URL is the only place where the HTML source mentions https://img.
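
The steps above can be sketched in Java roughly as follows. This is only an illustration, not tested against the live site: the class name, method names and the sample HTML string are made up, and it assumes the page source still contains og:image" content=" followed by a URL ending in .jpg.

```java
// Minimal sketch of the indexOf/substring approach (hypothetical names).
class OgImageExtractor {

    // The marker searched for in the HTML; it is 19 characters long.
    private static final String MARKER = "og:image\" content=\"";
    private static final String SUFFIX = ".jpg";

    /** Returns the URL that follows the og:image marker, or null if not found. */
    static String extractImageUrl(String html) {
        int startPos = html.indexOf(MARKER);
        if (startPos < 0) {
            return null;
        }
        int urlStart = startPos + MARKER.length();
        int endPos = html.indexOf(SUFFIX, urlStart);
        if (endPos < 0) {
            return null;
        }
        // endPos points at the '.' of ".jpg", so add the suffix length to keep it.
        return html.substring(urlStart, endPos + SUFFIX.length());
    }

    /** Shortened variant: start at the first "https://img." instead of the marker. */
    static String extractByImgPrefix(String html) {
        int startPos = html.indexOf("https://img.");
        if (startPos < 0) {
            return null;
        }
        int endPos = html.indexOf(SUFFIX, startPos);
        if (endPos < 0) {
            return null;
        }
        return html.substring(startPos, endPos + SUFFIX.length());
    }

    public static void main(String[] args) {
        // Made-up sample markup; a real page would be downloaded first.
        String html = "<head><meta property=\"og:image\" "
                + "content=\"https://img.example.test/pics/R-1.jpeg.jpg\" /></head>";
        System.out.println(extractImageUrl(html));
        System.out.println(extractByImgPrefix(html));
    }
}
```

Both methods return null when the marker or the .jpg suffix is missing, so the caller can tell that the page layout is not what was assumed.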

Compare the page at https://www.discogs.com/release/8140515 with the image at the extracted URL.

**note :** You might have to fine-tune those index Pos numbers. eg: You might change from **+19** to **+21** in order to cut off the quotation marks etc (**if needed** by your coding tool). You'll figure it out when testing... — VC.One, Feb 20 '16 at 04:21
Trying to fetch images of many releases, won't Discogs block automatic access? — Collector, Feb 20 '16 at 10:25
@Collector, I don't think so (unless you can show otherwise). Access was not blocked for any of my testing AS3 code or PHP code. Each loaded 5 images just to check paths are parsed correctly. — VC.One, Feb 21 '16 at 16:25
Okay. The question was to get images without API. I believe I showed a good / correct answer for that. As for 5000 pics, that's a new detail. I'm not a server expert. I can only suggest you pace it out to fly under the radar, cos I can imagine 5000 requests from the same IP address **at once** will look suspicious & be IP blocked. An "all day, everyday" site-user could access 5000 images spread over a week & won't be blocked so y'know... pace it out. — VC.One, Feb 23 '16 at 00:14
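
One way to "pace it out" is to put a fixed pause between requests. The sketch below is only an illustration: the class and method names are hypothetical, the fetch step is passed in as a function so any of the extraction approaches in this thread can be plugged in, and the delay value is an arbitrary assumption, not a limit documented by Discogs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper that spaces out calls with a fixed delay.
class PacedRunner {

    /**
     * Applies fetch to each URL in order, sleeping delayMillis between calls.
     * Stops early (returning the results so far) if the thread is interrupted.
     */
    static <T> List<T> runPaced(List<String> urls, Function<String, T> fetch, long delayMillis) {
        List<T> results = new ArrayList<>();
        for (int i = 0; i < urls.size(); i++) {
            if (i > 0) {
                try {
                    Thread.sleep(delayMillis); // pause before every call except the first
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt(); // restore the interrupt flag
                    break;
                }
            }
            results.add(fetch.apply(urls.get(i)));
        }
        return results;
    }
}
```

For example, runPaced(releaseUrls, url -> new DiscogRelease(url).getImageUrl(), 5000) would wait five seconds between pages; whether that is slow enough to avoid a block is a guess.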

alexandre-rousseau · Answer 2 · 2018-03-30T07:56:54.737

This is how to do it with Java and the Jsoup library:

1. Get the HTML page of the release.
2. Parse the HTML and find <meta property="og:image" content=".." /> to read its content value.

import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class DiscogRelease {

    private final String url;

    public DiscogRelease(String url) {
        this.url = url;
    }

    public String getImageUrl() {
        try {
            Document doc = Jsoup.connect(this.url).get();
            Elements metas = doc.head().select("meta[property=\"og:image\"]");
            if (!metas.isEmpty()) {
                Element element = metas.get(0);
                return element.attr("content");
            }
        } catch (IOException ex) {
            Logger.getLogger(DiscogRelease.class.getName()).log(Level.SEVERE, null, ex);
        }
        return null;
    }

}
