Question

How can I find the most repeated bi-grams (pairs of words) in a text using Java, without using a HashMap?

Answer #1

Program:

import java.util.*;
import java.io.*;

//Bigrams class
class Bigrams
{
   //main method
   public static void main (String[] args) throws IOException
   {
       //open the text file
       Scanner sc = new Scanner(new File("bigrams.txt"));

       //create an array of strings
       String text[] = new String[1000];
      
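       int n = 0;
       while(sc.hasNext())
       {
           //read a String from the file
           String s = sc.next();
           //convert to lower case
           s = s.toLowerCase();
           //remove the punctuation
           s = s.replaceAll("\\p{Punct}","");
           //skip tokens that consisted only of punctuation
           if(s.isEmpty())
               continue;
           //grow the array when it is full, so long files do not overflow it
           if(n == text.length)
               text = Arrays.copyOf(text, n*2);
           text[n++] = s;
       }
       //close the file
       sc.close();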
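       //a bigram needs at least two words
       if(n < 2)
       {
           System.out.println("No bi-grams found.");
           return;
       }
       //declare the array of counts (at most n-1 distinct bigrams)
       int count[] = new int[n-1];
       //declare the array of bigrams
       String bigrams[][] = new String[n-1][2];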
      
       int m = 0, j;
       //processing
       for(int i=0; i<n-1; i++)
       {
           for(j=0; j<m; j++)
           {
               //check for existing bigrams
               //the words are already lower case, so equals() is enough
               if(text[i].equals(bigrams[j][0]) && text[i+1].equals(bigrams[j][1]))
               {
                   count[j]++;
                   break;
               }
           }
           //for non-existing bigrams
           if(j==m)
           {
               bigrams[m][0] = text[i];
               bigrams[m][1] = text[i+1];
               count[m] = 1;
               m++;
           }
       }
      
       int max=0;
       j = 0;
       //calculate maximum frequency
       for(int i=0; i<m; i++)
       {
           if(count[i]>max)
           {
               max = count[i];
               j = i;
           }
       }
       //print the most repeated bi-grams
       System.out.println("Most repeated bi-grams: " + bigrams[j][0] + " " + bigrams[j][1]);
   }
}
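
The program above compares every bigram against every bigram already stored, so the work grows roughly with the square of the number of words. If that becomes slow on a longer file, one alternative that still avoids HashMap is to sort the bigrams so that equal ones end up next to each other, then count the runs. Below is a minimal sketch of that idea, not the approach of the original answer. The class name SortedBigrams and the method mostFrequent are made up for this illustration, and it takes the text as a String instead of reading bigrams.txt. Note that on a tie it returns the alphabetically first bigram, whereas the program above returns the one that appears first in the text.

```java
import java.util.Arrays;

// Hypothetical helper class; not part of the original answer.
class SortedBigrams
{
    // Returns the most frequent bigram as "first second", or null when
    // the text has fewer than two words.
    static String mostFrequent(String text)
    {
        // Same normalization as the program above: lower case, no punctuation.
        String cleaned = text.toLowerCase().replaceAll("\\p{Punct}", "").trim();
        if (cleaned.isEmpty())
            return null;
        String[] words = cleaned.split("\\s+");
        if (words.length < 2)
            return null;

        // Join each adjacent pair of words into one string.
        String[] pairs = new String[words.length - 1];
        for (int i = 0; i < pairs.length; i++)
            pairs[i] = words[i] + " " + words[i + 1];

        // Sorting places equal bigrams next to each other.
        Arrays.sort(pairs);

        // Walk the sorted array and remember the longest run of equal entries.
        String best = pairs[0];
        int bestCount = 0, run = 0;
        for (int i = 0; i < pairs.length; i++)
        {
            run = (i > 0 && pairs[i].equals(pairs[i - 1])) ? run + 1 : 1;
            if (run > bestCount)
            {
                bestCount = run;
                best = pairs[i];
            }
        }
        return best;
    }

    public static void main(String[] args)
    {
        System.out.println(mostFrequent("to be or not to be")); // prints "to be"
    }
}
```

Sorting costs about n log n comparisons, and the counting pass afterwards is a single loop, so this should scale better than the nested loops when the input is large.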

bigrams.txt

The book I read was called A Wrinkle In Time. In the book there is a main character named Meg. Meg and her brother Charles Wallace and a guy named Calvin go on a trip across time and space. They are trying to save their father, a scientist. The dad has been captured by a creature in another galaxy. The kids save the dad and go home using a tesseract.

Output:

Most repeated bi-grams: the book
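
Note that the program prints only the first bigram that reaches the highest count. By my count, "the dad" also appears twice in the sample text, so it ties with "the book". If you want every tied bigram reported, something like the sketch below could work. It is only an illustration: the class name TiedBigrams, the helper same, and the output format are assumptions of mine, and it expects the words to be lower-cased and stripped of punctuation already, as the program above does while reading.

```java
// Hypothetical helper class; the names and output format are illustrative.
class TiedBigrams
{
    // True when the bigrams starting at positions a and b are the same.
    static boolean same(String[] w, int a, int b)
    {
        return w[a].equals(w[b]) && w[a + 1].equals(w[b + 1]);
    }

    // Returns all bigrams that share the highest count, in order of first
    // appearance, formatted as "first second (count)" and joined by ", ".
    // Returns an empty string when there are fewer than two words.
    static String mostRepeated(String[] words)
    {
        int pairs = Math.max(words.length - 1, 0);
        // count[i] stays 0 unless position i is the first occurrence of its bigram
        int[] count = new int[pairs];
        int max = 0;
        for (int i = 0; i < pairs; i++)
        {
            // skip positions whose bigram was already counted earlier
            boolean seenBefore = false;
            for (int k = 0; k < i && !seenBefore; k++)
                seenBefore = same(words, k, i);
            if (seenBefore)
                continue;
            // count this bigram from its first occurrence onwards
            for (int k = i; k < pairs; k++)
                if (same(words, k, i))
                    count[i]++;
            if (count[i] > max)
                max = count[i];
        }

        // collect every bigram whose count equals the maximum
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < pairs; i++)
        {
            if (max > 0 && count[i] == max)
            {
                if (out.length() > 0)
                    out.append(", ");
                out.append(words[i]).append(' ').append(words[i + 1])
                   .append(" (").append(max).append(')');
            }
        }
        return out.toString();
    }

    public static void main(String[] args)
    {
        String[] words = "the book in the book the dad saw the dad".split(" ");
        System.out.println(mostRepeated(words)); // prints "the book (2), the dad (2)"
    }
}
```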
