I have a dataframe in which one column contains a list of strings in each row.
On average, each list has 150 words of about 6 characters each.
Each of the 700 rows of the dataframe corresponds to one document, and each string in its list is a word of that document; so basically I have tokenised the words of each document.
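For reference, a minimal two-row example with the same structure would look like this (the words here are made up; my real column is called complete_text, as in the code below):

import pandas as pd

# Toy version of my data: each row holds the tokenised words of one
# document; the real dataframe has 700 such rows.
df_per_doc = pd.DataFrame({
    'complete_text': [
        ['apple', 'banana', 'orange', 'grape'],
        ['pomme', 'banane', 'orange', 'raisin'],
    ]
})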
I want to detect the language of each of these documents, and to do this I first try to detect the language of each word of a document.
To that end, I do the following:
from textblob import TextBlob

def lang_detect(document):
    # Count how many words of the document are detected as each language.
    lang_count = {}
    for word in document:
        # Skip very short words, whose language detection is unreliable.
        if len(word) >= 4:
            lang_result = TextBlob(word).detect_language()
            lang_count[lang_result] = lang_count.get(lang_result, 0) + 1
    return lang_count

df_per_doc['languages_count'] = df_per_doc['complete_text'].apply(lang_detect)
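If this worked, I would expect each entry of languages_count to be a dict mapping language codes to word counts, e.g. something like this for a mostly-English document (the numbers are made up):

{'en': 132, 'fr': 4, 'de': 2}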
When I run this, I get the following error:
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-42-772df3809bcb> in <module>
     25 
---> 27 df_per_doc['languages_count'] = df_per_doc['complete_text'].apply(lambda x: lang_detect(x))
     28 
     29 
.
.
.
    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650 
    651 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 429: Too Many Requests
The traceback is much longer; I have omitted its middle part.
Note that I get the same error even if I run this for only two documents/rows.
Is there any way to get a response from textblob for more words and documents?
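One idea I have considered, but not yet tried, is to issue a single request per document by joining the tokens back together, which should cut the number of calls by roughly a factor of 150 (the function and column names below are just placeholders for this sketch):

from textblob import TextBlob
import time

def lang_detect_per_doc(document):
    # Join the tokenised words back into one string so that only a
    # single request is made per document instead of one per word.
    text = ' '.join(word for word in document if len(word) >= 4)
    lang = TextBlob(text).detect_language()
    time.sleep(1)  # crude throttle between consecutive requests
    return lang

df_per_doc['document_language'] = df_per_doc['complete_text'].apply(lang_detect_per_doc)

But I am not sure whether this alone is enough to stay under the rate limit, and it also loses the per-word counts.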