Negation detection in text

Hi everyone,

I'm working on negation detection for sentiment analysis in text files. The most common negation words are "not", "no" and "never". For example, "good" is classified as positive, but "not good" should be classified as negative.

So far I can only detect negations where the adjective comes immediately after the negation word ("not good"). But I need to consider wider scopes: the negation should stay in effect until the end of the sentence. That is, if the system finds "not" in a sentence, I want it to apply to all the adjectives up to the "." mark. For example: "this movie was not a very good one." Here "good" does not come right after "not", but since it appears before the end of the sentence, it should be classified as negative.
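A minimal sketch of this sentence-level scope, not a definitive implementation. It assumes sentences end at ".", "!" or "?", that tokens are separated by whitespace, and that punctuation attached to a word (as in "good.") should be stripped before the lookup. The class name SentenceNegation, the method score and the NEGATIONS set are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SentenceNegation {
    // Hypothetical negation list, taken from the words named in the post.
    static final Set<String> NEGATIONS =
            new HashSet<>(Arrays.asList("not", "no", "never"));

    /** Returns {positiveCount, negativeCount} for the given text. */
    static int[] score(String text, Set<String> positiveWords, Set<String> negativeWords) {
        int pos = 0;
        int neg = 0;
        boolean negated = false; // true from a negation word until the sentence ends

        for (String token : text.toLowerCase().split("\\s+")) {
            // Assumption: a token ending in '.', '!' or '?' closes the sentence.
            boolean endsSentence = token.matches(".*[.!?]+");
            // Strip leading and trailing punctuation so "good." matches "good".
            String word = token.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");

            if (NEGATIONS.contains(word)) {
                negated = true;
            } else if (positiveWords.contains(word)) {
                if (negated) neg++; else pos++;
            } else if (negativeWords.contains(word)) {
                if (negated) pos++; else neg++;
            }

            // Reset only after the last word of the sentence has been scored.
            if (endsSentence) {
                negated = false;
            }
        }
        return new int[] {pos, neg};
    }
}
```

The key difference from a per-word flag is that the reset happens at the sentence boundary instead of after every word, so "not a very good one." counts "good" as negative while a "good" in the next sentence still counts as positive.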

There are three input arguments:
1- a file containing a list of negative words like "bad" and "boring"
2- a file containing a list of positive words like "good" and "interesting"
3- a directory containing the text files, which are movie reviews.

My problem is that I cannot extend the scope of my negation. How can I check the next 5 words after the negation word, or how can I find the adjectives that come after the negation word and before the end of the sentence?
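For the "next 5 words" variant, one possible approach is to replace the boolean flag with a counter of how many words the negation still covers. This is only a sketch under stated assumptions: the class name WindowNegation, the method score, the window parameter and the NEGATIONS set are hypothetical, and punctuation is stripped from tokens in the same way a real tokenizer might not.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class WindowNegation {
    // Hypothetical negation list, taken from the words named in the post.
    static final Set<String> NEGATIONS =
            new HashSet<>(Arrays.asList("not", "no", "never"));

    /** Returns {positiveCount, negativeCount}; a negation covers the next `window` words. */
    static int[] score(String text, Set<String> positiveWords,
                       Set<String> negativeWords, int window) {
        int pos = 0;
        int neg = 0;
        int remaining = 0; // words still covered by the most recent negation

        for (String token : text.toLowerCase().split("\\s+")) {
            // Strip leading and trailing punctuation so "good," matches "good".
            String word = token.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
            if (word.isEmpty()) {
                continue;
            }
            if (NEGATIONS.contains(word)) {
                remaining = window; // (re)start the window at each negation word
                continue;
            }

            boolean negated = remaining > 0;
            if (negated) {
                remaining--; // this word uses up one slot of the window
            }

            if (positiveWords.contains(word)) {
                if (negated) neg++; else pos++;
            } else if (negativeWords.contains(word)) {
                if (negated) pos++; else neg++;
            }
        }
        return new int[] {pos, neg};
    }
}
```

With window set to 5, "not a very good one" counts "good" as negative, because it is the third word after "not". A fixed window can run past the end of a sentence, so it may be worth combining it with a sentence-boundary reset.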



Here is the code I have written so far. I also tried a few while and for loops for this purpose, but I could not get any of them to work.

I hope this is clear. Please don't hesitate to ask any further questions.

regards,
Maral


import java.io.*;
import java.util.*;

public class Find
{
    public static void main(String args[])
    {
        // Arguments:
        //   args[0] - file containing the negative words, one per line
        //   args[1] - file containing the positive words, one per line
        //   args[2] - directory containing the movie review text files
        if(args.length!=3)
        {
            System.err.println("Usage: java Find <negative-words-file> <positive-words-file> <reviews-directory>");
            System.exit(1);
        }
        try
        {
            //read the word lists
            HashSet<String> positiveWords = new HashSet<String>();
            HashSet<String> negativeWords = new HashSet<String>();

            BufferedReader br = new BufferedReader(new FileReader(args[0]));
            String line;
            while((line=br.readLine())!=null)
            {
                negativeWords.add(line.toLowerCase());
            }
            br.close();

            br = new BufferedReader(new FileReader(args[1]));
            while((line=br.readLine())!=null)
            {
                positiveWords.add(line.toLowerCase());
            }
            br.close();

            System.out.println("number of positive words read: "+positiveWords.size());
            System.out.println("number of negative words read: "+negativeWords.size());

            //read each file from the input directory
            File folder = new File(args[2]);
            File[] listOfFiles = folder.listFiles();
            if(listOfFiles!=null && listOfFiles.length > 0)
            {
                for(int i=0; i<listOfFiles.length; i++)
                {
                    //skip sub-directories and other non-file entries
                    if(!listOfFiles[i].isFile())
                    {
                        continue;
                    }

                    System.out.println("Dealing with "+listOfFiles[i]);
                    System.out.println(i);
                    int posCounter = 0;
                    int negCounter = 0;

                    //per-file counts of the sentiment words that were not negated
                    HashMap<String,Integer> positives = new HashMap<String,Integer>();
                    HashMap<String,Integer> negatives = new HashMap<String,Integer>();

                    Scanner sc = new Scanner(new BufferedReader(new FileReader(listOfFiles[i])));

                    //true only while the previous word was "not"
                    boolean isNegation = false;
                    while(sc.hasNext())
                    {
                        String w = (sc.next()).toLowerCase();

                        // Set flag if word is not
                        if (w.equals("not"))
                        {
                            isNegation = true;
                        }
                        else
                        {
                            if(positiveWords.contains(w))
                            {
                                if (isNegation)
                                {
                                    //a negated positive word counts as negative
                                    negCounter++;
                                }
                                else
                                {
                                    posCounter++;
                                    if(positives.containsKey(w))
                                    {
                                        positives.put(w, positives.get(w) + 1);
                                    }
                                    else
                                    {
                                        positives.put(w, 1);
                                    }
                                }
                            }

                            if(negativeWords.contains(w))
                            {
                                if (isNegation)
                                {
                                    //a negated negative word counts as positive
                                    posCounter++;
                                }
                                else
                                {
                                    negCounter++;
                                    if(negatives.containsKey(w))
                                    {
                                        //store the incremented count (the original stored 1 again)
                                        negatives.put(w, negatives.get(w) + 1);
                                    }
                                    else
                                    {
                                        negatives.put(w, 1);
                                    }
                                }
                            }

                            //reset inside the else branch, so the flag set by "not"
                            //survives until the following word has been checked
                            isNegation = false;
                        }
                    }
                    sc.close();

                    Iterator<String> it = positives.keySet().iterator();
                    while(it.hasNext())
                    {
                        String w = it.next();
                        System.out.println(w+": "+positives.get(w));
                    }

                    it = negatives.keySet().iterator();
                    while(it.hasNext())
                    {
                        String w = it.next();
                        System.out.println(w+": "+negatives.get(w));
                    }
                    System.out.println("number of positives: "+posCounter);
                    System.out.println("number of negatives: "+negCounter);
                }
            }
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}
Oct 25 '10 #1
jkmyoung
In terms of style, I would set isNegation=false at the beginning of the loop as opposed to before the loop and at the end of it.

You definitely need more documentation, particularly around your incoming arguments.

You might also want equalsIgnoreCase instead of equals.
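As a small illustration of the equalsIgnoreCase suggestion, here is a sketch of a helper that recognizes the three negation words named in the question regardless of case. The class name NegationWords and the method isNegation are hypothetical. Since the posted code already lowercases each token, a plain equals check would behave the same there; the helper is mainly useful if the lowercasing is ever removed.

```java
class NegationWords {
    // Negation words named in the question; extend as needed.
    static final String[] NEGATIONS = {"not", "no", "never"};

    /** Returns true if the word is one of the negation words, ignoring case. */
    static boolean isNegation(String word) {
        for (String negation : NEGATIONS) {
            if (negation.equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }
}
```

In the posted loop, the check w.equals("not") could then become NegationWords.isNegation(w), which would also cover "no" and "never".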
Oct 25 '10 #2
