I want to write a scraper in Python that crawls some URLs, then scrapes and saves the data. I know how to write it as a simple program. I'm looking for a way to deploy it on my virtual server (running Ubuntu) as a service so that it crawls non-stop. Could anyone tell me how I can do this?
        Bardia Heydari
        
- So which part do you need help on? Letting it run for a long time? Writing a crawler? Using python with Linux? – Sleep Deprived Bulbasaur Aug 14 '14 at 13:24
- Letting it run for a long time :) @SleepDeprivedBulbasaur – Bardia Heydari Aug 14 '14 at 13:25
- Use scrapy to build your own crawler. Don't ever let the queue of URLs that it's scraping go empty. http://doc.scrapy.org/en/latest/topics/spiders.html – FrobberOfBits Aug 14 '14 at 13:34
- How about `while True: scrape()`? That will run for a long time. – Kevin Aug 14 '14 at 13:34
- @BardiaHeydari you want to look into daemonizing it. – Sleep Deprived Bulbasaur Aug 14 '14 at 13:38
1 Answer
What you want to do is daemonize the process. The python-daemon package (imported as `daemon` in the example below) is helpful for creating a daemon.
Daemonizing the process lets it run in the background, so it keeps running for as long as the server is up, even when no user is logged in.
Here is an example daemon that writes the current time to a file every 5 seconds.
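import time

import daemon  # provided by the python-daemon package

def do_something():
    # Overwrite the file with the current time every 5 seconds, forever.
    while True:
        with open("/tmp/current_time.txt", "w") as f:
            f.write("The time is now " + time.ctime())
        time.sleep(5)

def run():
    # Detach from the terminal, then run the loop in the background.
    with daemon.DaemonContext():
        do_something()

if __name__ == "__main__":
    run()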
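For the scraper in the question, the body of `do_something()` would be a crawl loop rather than a clock. The following is only a rough sketch of what that loop might look like, using the standard library alone. The names `fetch`, `crawl_forever`, `save_page`, and `max_cycles` are made up for illustration and are not part of python-daemon or of the asker's code. The point it tries to show is that one failing URL should be logged and skipped, so the long-running process does not die.

```python
import logging
import time
import urllib.request

logger = logging.getLogger("crawler")  # hypothetical logger name


def fetch(url, timeout=10):
    """Download one page and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def crawl_forever(urls, fetch_page, save_page, delay=5,
                  max_cycles=None, sleep=time.sleep):
    """Visit every URL again and again, pausing `delay` seconds per pass.

    max_cycles=None loops forever; a number stops after that many passes,
    which is mainly useful for testing. Returns the number of pages saved.
    """
    saved = 0
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        for url in urls:
            try:
                save_page(url, fetch_page(url))
                saved += 1
            except Exception as exc:
                # Log and move on so one bad URL does not stop the crawl.
                logger.warning("failed to scrape %s: %s", url, exc)
        cycle += 1
        sleep(delay)
    return saved
```

Inside `with daemon.DaemonContext():` you would then call something like `crawl_forever(my_urls, fetch, my_save_function)` in place of `do_something()`, where `my_urls` and `my_save_function` are your own. Note that a daemonized process has no terminal, so the logger needs a file handler if you want to see the warnings.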
 
    
    
        Sleep Deprived Bulbasaur
        