
getting <generator object <genexpr> at 0x1193417d8> as output

#Reading files with txt extension
def get_sentences():
    for root, dirs, files in os.walk("/Users/Documents/test1"):
        for file in files:
            if file.endswith(".txt"):
                x_ = codecs.open(os.path.join(root,file),"r", "utf-8-sig")
                for lines in x_.readlines():
                    yield lines
formoreprocessing = get_sentences()

#Tokenizing sentences of the text files

from nltk.tokenize import sent_tokenize
for i in formoreprocessing:
    raw_docs = sent_tokenize(i)
    tokenized_docs = [sent_tokenize(i) for sent in raw_docs]

'''Removing Stop Words'''
stopword_removed_sentences = []
from nltk.corpus import stopwords
stopset = set(stopwords.words("English"))
def strip_stopwords(sentence):
    return ' '.join(word for word in sentence.split() if word not in stopset)
stopword_removed_sentences = (strip_stopwords(sentence) for sentence in raw_docs)
print(stopword_removed_sentences)

The code above does not print what it is supposed to. Instead, the output is: <generator object <genexpr> at 0x1193417d8>. What is the mistake here? I am using Python 3.5.
Jun 19 '16 #1
1 Reply


Cross-posted to http://stackoverflow.com/questions/3...17d8-as-output, which contains an answer.
Jun 20 '16 #2
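
For anyone who lands here without following the link: the output in the question is most likely explained by the parentheses around `(strip_stopwords(sentence) for sentence in raw_docs)`. That syntax builds a generator expression, which is lazy, so `print()` shows the generator object itself rather than its values. Below is a minimal sketch of the difference. The small `stopset` and the two sample sentences are hypothetical stand-ins (not from the original post) so that the snippet runs without NLTK or any files on disk.

```python
# Hypothetical stand-ins so the sketch runs without NLTK or any files on disk.
stopset = {"the", "is", "a", "of"}
raw_docs = ["the cat is on a mat", "a bag of apples"]


def strip_stopwords(sentence):
    """Drop every whitespace-separated word that appears in stopset."""
    return ' '.join(word for word in sentence.split() if word not in stopset)


# Parentheses build a lazy generator expression: nothing is computed yet,
# so print() would show something like <generator object <genexpr> at 0x...>.
lazy_sentences = (strip_stopwords(sentence) for sentence in raw_docs)

# Square brackets build a list, so the values exist and can be printed.
stopword_removed_sentences = [strip_stopwords(sentence) for sentence in raw_docs]
print(stopword_removed_sentences)  # ['cat on mat', 'bag apples']

# An existing generator can also be consumed explicitly with list().
consumed = list(lazy_sentences)
```

Two other things in the posted code may be worth checking, although neither causes the generator output. First, `raw_docs` is reassigned on every pass of the `for i in formoreprocessing` loop, so only the sentences from the last line read survive; collecting them into one list (for example with `extend`) would keep them all. Second, the NLTK stop word list is, as far as I know, stored under the lowercase name `english`, so `stopwords.words("English")` may fail on a case-sensitive file system.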
