Colly is a web scraping framework written in Go. Its import path is github.com/gocolly/colly (source: https://github.com/gocolly/colly). You will typically use this tag together with the main tag [go].
Questions tagged [go-colly]
63 questions
                    
4 votes, 1 answer
            
        Go Colly not returning any data from website
I am trying to make a simple web scraper in Go, and I can't seem to get the most basic functionality out of colly. I took the basic example from the colly docs, and while it worked with the hackernews.org site they used, it isn't working with the site I…
        
        Cade
        
- 89
 - 1
 - 8
 
3 votes, 1 answer
            
        add colly package output text to map in golang
I was making a web scraper with the colly package that collects the ContestName and ContestTime from a website and writes them to a JSON file.
So I did this:
    Contests := make(map[string]map[string]map[string]map[string]string)
    
   …
        
        Vinay Kumar Rasala
        
- 89
 - 6
 
3 votes, 1 answer
            
        Get values from same class name values in colly web scraping
I am working on a small web scraping application using Go and the colly web scraping framework.
Here is the HTML code of the website:

        Dinesh s
        
- 313
 - 4
 - 19
 
3 votes, 0 answers
            
        Passing cookies from Go Rod (Headless browser) to requests, Colly cookiejar
I am trying to pass cookies from a headless browser in Go to the requests package cookiejar. There are some JS-generated cookies that I need to grab using the headless browser and then pass to the requests module.
I currently have this to export…
        
        AntBox
        
- 31
 - 1
 
3 votes, 1 answer
            
        How to use selectors properly
I'm writing a crawler to retrieve some data from some pages. The logic of how to build it is clear to me, but I am confused about how to use the selectors properly.
I would like to get the title of some news using colly. I went to the page…
        
        MrByte
        
- 97
 - 1
 - 10
 
2 votes, 1 answer
            
        how to ignore printing Max depth limit reached go colly
I have a Go colly crawler that I am using to crawl many sites. On my terminal it prints a lot of:
2023/05/30 02:22:56 Max depth limit reached
2023/05/30 02:22:56 Max depth limit reached
2023/05/30 02:22:56 Max depth limit reached
2023/05/30…
        
        Farshad
        
- 1,830
 - 6
 - 38
 - 70
 
2 votes, 1 answer
            
        Scraping all possible tags and putting them into one variable using Go Colly
I need to scrape different tags from a list of sites, put them in a variable, and then write them to a .csv list. For example, all lines where the author of the article is mentioned (div.author, p.author etc). On all sites, the location of this line and the…
        
        Maxim Zhukotsky
        
- 23
 - 3
 
2 votes, 1 answer
            
        Max Rate limit of StackOverflow
I have been trying to access Stack Overflow at a rate of 30 requests per second, but it is not working: I get blocked after a few seconds, although the Stack Overflow documentation says the max rate limit of Stack Exchange is 30 req/s.
The…
        
        Hiếu Nguyễn Trung
        
- 21
 - 3
 
2 votes, 1 answer
            
        Web scrapping using Golang Colly, How to handle XML path not found?
I am using Colly to scrape an e-commerce website. I will loop over many products.
Here is a snippet of my code that gets a subtitle:
    c.OnXML("/html/body/div[4]/div/div[3]/div[2]/div/div[1]/div[3]/div/div/h1/1234", func(e *colly.XMLElement) {
  …
        
        Chau Loi
        
- 1,106
 - 1
 - 14
 - 36
 
2 votes, 1 answer
            
        Go Colly how to find requested element?
I'm trying to get a specific table and loop through its content using colly, but the table is not being recognized. Here's what I have so far:
package main
import (
    "fmt"
    
    "github.com/gocolly/colly"
)
func main() {
    c :=…
        
        Lynx
        
- 105
 - 9
 
2 votes, 1 answer
            
        How do I scrape TLS certificates using go-colly?
I am using Colly to scrape a website and I am trying to also get the TLS certificate that the site is presenting during the TLS handshake. I looked through the documentation and the response object but did not find what I was looking for.
According…
        
        user234980238402
        
- 21
 - 1
 
2 votes, 1 answer
            
        Go Colly parallelism decreases the number of links scraped
I am trying to build a web scraper to scrape jobs from internshala.com, using Go and colly. I visit every page and then visit the subsequent links of each job to scrape data from. Doing this in a sequential manner scrapes…
        
        Adnan
        
- 88
 - 1
 - 7
 
2 votes, 0 answers
            
        Web scraping site using polymerjs / webcomponent
I'm using colly to scrape YouTube charts. This site uses polymerjs, and as a result I'm having trouble capturing the DOM elements. A simple test I did was to run document.querySelector("#search-native") in the console, and it returns null.
I saw an…
        
        Jess
        
- 53
 - 1
 - 1
 - 5
 
2 votes, 1 answer
            
        What can the go-colly library do?
Can the go-colly library crawl all HTML tags and text content under a div tag? If so, how? I can get all the text under a div tag, like this:
c.OnHTML("body .post-topic-main .post-topic-des", func(e *colly.HTMLElement) {
            text =…
        
        N Fx
        
- 41
 - 3
 
2 votes, 1 answer
            
        Parsing nested elements using go-colly scraper
I'm using go-colly to scrape data from a webpage.
I'm unable to parse out the image src from this nested HTML element.
    c.OnHTML(".result-row", func(e *colly.HTMLElement) {
        qoquerySelection := e.DOM
       …
        
        Ryan
        
- 1,102
 - 1
 - 15
 - 30