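Design a class in Java that counts the word frequencies of an English article and prints the words from highest to lowest frequency. Just modify the code below.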

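Can you recommend a website or application for looking up English word frequencies?

Google Kingsoft PowerWord (谷歌金山词霸) works well; you can download it online.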

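#include <stdio.h>

/* Counts the whitespace-separated words in the text read from standard input. */
int main(void)
{
    char c[500];
    long i, len, n = 0;
    int inWord = 0; /* 1 while the scan is inside a word */
    printf("Please input the article:\n");
    len = (long)fread(c, 1, sizeof c, stdin); /* reads at most 500 characters */
    for (i = 0; i < len; i++) {
        if (c[i] == ' ' || c[i] == '\n' || c[i] == '\t') {
            inWord = 0;
        } else if (!inWord) {
            inWord = 1; /* first character of a new word */
            n++;
        }
    }
    printf("This article has %ld words\n", n);
    return 0;
} /* Poster's note: I have run this program and it works. */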

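This problem could be solved much more efficiently if an extra class were allowed. If everything has to stay inside the given skeleton, the code is cumbersome and inefficient.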

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class Article {

    // Holds the article text
    String content;
    // Holds the words produced by splitting the text
    String[] rawWords;
    // Holds the distinct words after counting
    String[] words;
    // Holds the frequency of each entry in words
    int[] wordFreqs;

    // Constructor: supplies the article text
    // Extension: read the text from a file instead
    public Article() {
        content = "kolya is one of the richest films i've seen in some time . zdenek sverak plays a confirmed old bachelor ( who's likely to remain so ) , who finds his life as a czech cellist increasingly impacted by the five-year old boy that he's taking care of . though it ends rather abruptly-- and i'm whining , 'cause i wanted to spend more time with these characters-- the acting , writing , and production values are as high as , if not higher than , comparable american dramas . this father-and-son delight-- sverak also wrote the script , while his son , jan , directed-- won a golden globe for best foreign language film and , a couple days after i saw it , walked away an oscar . in czech and russian , with english subtitles . ";
    }

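    // Splits the article on separators and stores the result in rawWords
    public void splitWord() {
        // Punctuation takes no part in tokenizing, so every symbol is replaced with a space
        final char SPACE = ' ';
        content = content.replace('\'', SPACE).replace(',', SPACE).replace('.', SPACE);
        content = content.replace('(', SPACE).replace(')', SPACE).replace('-', SPACE);

        // Anything separated by whitespace counts as a word. Because ' was replaced above,
        // "i've" is split into two words. trim() avoids an empty first token.
        rawWords = content.trim().split("\\s+");
    }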

    // Counts the words by traversing the array
    public void countWordFreq() {
        // Put every distinct string into a sorted set. The poster avoided a Map here,
        // believing lookups to be slow; a HashMap would in fact allow counting in one pass.
        Set<String> set = new TreeSet<String>();

        for (String word : rawWords) {
            set.add(word);
        }

        Iterator<String> ite = set.iterator();

        // The number of distinct strings is not known in advance, so collect them in lists first
        List<String> wordsList = new ArrayList<String>();
        List<Integer> freqList = new ArrayList<Integer>();
        while (ite.hasNext()) {
            String word = ite.next();

            int count = 0; // number of occurrences of this string
            for (String str : rawWords) {
                if (str.equals(word)) {
                    count++;
                }
            }

            wordsList.add(word);
            freqList.add(count);
        }

        // Copy the results into the arrays
        words = wordsList.toArray(new String[0]);

        wordFreqs = new int[freqList.size()];
        for (int i = 0; i < freqList.size(); i++) {
            wordFreqs[i] = freqList.get(i);
        }
    }

    // Sorts the word array and the frequency array in descending order of frequency
    public void sort() {

        class Word {
            private String word;
            private int freq;

            public Word(String word, int freq) {
                this.word = word;
                this.freq = freq;
            }
        }

        // Ordering: 1) descending frequency; 2) for equal frequencies, descending
        // alphabetical order, e.g. 'abc' > 'ab' > 'aa'
        class WordComparator implements Comparator<Word> {

            public int compare(Word word1, Word word2) {
                if (word1.freq != word2.freq) {
                    return Integer.compare(word2.freq, word1.freq);
                }
                // compareTo gives ascending order, so the operands are swapped
                return word2.word.compareTo(word1.word);
            }
        }

        List<Word> wordList = new ArrayList<Word>();

        for (int i = 0; i < words.length; i++) {
            wordList.add(new Word(words[i], wordFreqs[i]));
        }

        Collections.sort(wordList, new WordComparator());

        for (int i = 0; i < wordList.size(); i++) {
            Word wor = wordList.get(i);

            words[i] = wor.word;
            wordFreqs[i] = wor.freq;
        }
    }

    // Prints the sorted result
    public void printResult() {
        System.out.println("Total " + words.length + " different words in the content!");

        for (int i = 0; i < words.length; i++) {
            System.out.println(wordFreqs[i] + " " + words[i]);
        }
    }

    // Exercises the class
    public static void main(String[] args) {
        Article a = new Article();
        a.splitWord();
        a.countWordFreq();
        a.sort();
        a.printResult();
    }
}

-----------------------
Total 99 different words in the content!
5 and
4 the
4 i
4 a
3 as
2 with
2 who
2 to
2 time
2 sverak
2 son
2 s
2 old
2 of
2 it
2 in
2 his
2 czech
1 zdenek
1 year
1 wrote
1 writing
1 won
1 whining
1 while
1 wanted
1 walked
1 ve
1 values
1 though
1 this
1 these
1 that
1 than
1 taking
1 subtitles
1 spend
1 some
1 so
1 seen
1 script
1 saw
1 russian
1 richest
1 remain
1 rather
1 production
1 plays
1 oscar
1 one
1 not
1 more
1 m
1 likely
1 life
1 language
1 kolya
1 jan
1 is
1 increasingly
1 impacted
1 if
1 higher
1 high
1 he
1 golden
1 globe
1 foreign
1 for
1 five
1 finds
1 films
1 film
1 father
1 english
1 ends
1 dramas
1 directed
1 delight
1 days
1 couple
1 confirmed
1 comparable
1 characters
1 cellist
1 cause
1 care
1 by
1 boy
1 best
1 bachelor
1 away
1 are
1 an
1 american
1 also
1 after
1 acting
1 abruptly
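
The remark that counting inside the given skeleton is inefficient can be made concrete: countWordFreq rescans rawWords once for every distinct word. A minimal sketch of a single-pass alternative follows, assuming a HashMap is allowed. The class name MapWordCounter and its methods count and sorted are hypothetical and not part of the original answer; the tie-break (reverse alphabetical) is chosen to mirror the output listed above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; not part of the original answer.
class MapWordCounter {

    // Counts each whitespace-separated token in one pass over the text.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> freq = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                freq.merge(token, 1, Integer::sum);
            }
        }
        return freq;
    }

    // Orders entries by frequency (high to low), then by word (reverse alphabetical).
    static List<Map.Entry<String, Integer>> sorted(Map<String, Integer> freq) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        Comparator<Map.Entry<String, Integer>> byFreqDesc =
                (a, b) -> Integer.compare(b.getValue(), a.getValue());
        Comparator<Map.Entry<String, Integer>> byWordDesc =
                (a, b) -> b.getKey().compareTo(a.getKey());
        entries.sort(byFreqDesc.thenComparing(byWordDesc));
        return entries;
    }

    public static void main(String[] args) {
        String text = "the cat and the dog and the bird";
        for (Map.Entry<String, Integer> e : sorted(count(text))) {
            System.out.println(e.getValue() + " " + e.getKey());
        }
    }
}
```

Run on the sample sentence in main, this prints "3 the", "2 and", "1 dog", "1 cat", "1 bird", one per line. Each token costs one hash lookup, so the work grows with the number of tokens rather than with tokens times distinct words.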

The test results are:
Total 123 words; the words in the article and their counts follow.
--------word----count--------
-------and----5--------
-------a----4--------
-------the----4--------
-------as----3--------
-------of----2--------
-------time----2--------
-------czech----2--------
-------son----2--------
-------i----2--------
-------to----2--------
-------old----2--------
-------his----2--------
-------with----2--------
-------it----2--------
-------sverak----2--------
-------in----2--------
-------for----1--------
-------higher----1--------
-------wrote----1--------
-------production----1--------
-------oscar----1--------
-------confirmed----1--------
-------are----1--------
-------zdenek----1--------
-------year----1--------
-------these----1--------
-------ends----1--------
-------comparable----1--------
-------not----1--------
-------he's----1--------
-------russian----1--------
-------'cause----1--------
-------bachelor----1--------
-------saw----1--------
-------language----1--------
-------some----1--------
-------i've----1--------
-------kolya----1--------
-------abruptly----1--------
-------wanted----1--------
-------delight----1--------
-------life----1--------
-------american----1--------
-------rather----1--------
-------best----1--------
-------subtitles----1--------
-------walked----1--------
-------dramas----1--------
-------films----1--------
-------seen----1--------
-------taking----1--------
-------impacted----1--------
-------remain----1--------
-------days----1--------
-------finds----1--------
-------by----1--------
-------plays----1--------
-------though----1--------
-------who----1--------
-------after----1--------
-------more----1--------
-------values----1--------
-------who's----1--------
-------care----1--------
-------jan----1--------
-------so----1--------
-------likely----1--------
-------richest----1--------
-------script----1--------
-------that----1--------
-------than----1--------
-------i'm----1--------
-------acting----1--------
-------foreign----1--------
-------english----1--------
-------this----1--------
-------characters----1--------
-------golden----1--------
-------one----1--------
-------writing----1--------
-------father----1--------
-------while----1--------
-------if----1--------
-------couple----1--------
-------won----1--------
-------globe----1--------
-------film----1--------
-------whining----1--------
-------is----1--------
-------five----1--------
-------cellist----1--------
-------spend----1--------
-------away----1--------
-------directed----1--------
-------an----1--------
-------increasingly----1--------
-------high----1--------
-------boy----1--------
-------also----1--------
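
Unlike the first listing, this output keeps contractions such as i've and he's as single words. One way to make that tokenizing choice explicit is to match words with a regular expression instead of splitting on separators. The sketch below is an illustration under stated assumptions, not the poster's method: ContractionTokenizer and tokenize are hypothetical names, and, unlike the output above, the pattern drops a leading apostrophe, so 'cause becomes cause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the original answer.
class ContractionTokenizer {

    // A run of letters, optionally followed by apostrophe-joined letter runs (e.g. "i've").
    private static final Pattern WORD = Pattern.compile("[a-z]+(?:'[a-z]+)*");

    // Returns the lower-cased words of the text in order of appearance.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = WORD.matcher(text.toLowerCase(Locale.ROOT));
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("i've seen it -- 'cause he's five-year old ."));
    }
}
```

Matching never yields empty tokens, and hyphenated forms such as five-year still come out as two words.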

The source code follows:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Article {
    // Holds the article text
    String content;

    // Holds the words produced by splitting the text
    String[] rawWords;

    // Holds the distinct words after counting
    String[] words;

    // Holds the frequency of each entry in words
    int[] wordFreqs;

    // Constructor: supplies the article text
    // Extension: read the text from a file instead
    public Article() {
        content = "kolya is one of the richest films i've seen in some time . "
                + "zdenek sverak plays a confirmed old bachelor ( who's likely to remain so ) , "
                + "who finds his life as a czech cellist increasingly impacted by the five-year "
                + "old boy that he's taking care of . though it ends rather abruptly-- and i'm "
                + "whining , 'cause i wanted to spend more time with these characters-- the acting , "
                + "writing , and production values are as high as , if not higher than , comparable "
                + "american dramas . this father-and-son delight-- sverak also wrote the script , "
                + "while his son , jan , directed-- won a golden globe for best foreign language film "
                + "and , a couple days after i saw it , walked away an oscar . in czech and russian , "
                + "with english subtitles . ";
    }

    // Splits the article on separators and stores the result in rawWords
    public void splitWord() {
        rawWords = content.split(" [\\.,()]{0,1} {0,1},{0,1} {0,1}|-- |-");
    }

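    // Counts the words by traversing the array
    public void countWordFreq() {
        // Sized for the worst case (every word distinct); unused slots stay null / 0
        words = new String[rawWords.length];
        wordFreqs = new int[rawWords.length];
        int length = 0; // number of distinct words found so far
        for (int i = 0; i < rawWords.length; i++) {
            boolean isExist = false;
            int j = 0;
            // Linear search for the current word among those already seen
            for (; j < length; j++) {
                if (words[j].equals(rawWords[i])) {
                    isExist = true;
                    break;
                }
            }
            if (isExist) {
                wordFreqs[j]++;
            } else {
                wordFreqs[length]++;
                words[length] = rawWords[i];
                length++;
            }
        }
    }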

    // Sorts the word array and the frequency array in descending order of frequency
    public void sort() {
        Map<String, Integer> value = new HashMap<String, Integer>();
        for (int i = 0; i < this.words.length; i++) {
            // Skip the unused (null) slots left at the end of the words array
            if (this.words[i] != null) {
                value.put(this.words[i], this.wordFreqs[i]);
            }
        }
        List<Map.Entry<String, Integer>> info = new ArrayList<Map.Entry<String, Integer>>(
                value.entrySet());
        Collections.sort(info, new Comparator<Map.Entry<String, Integer>>() {
            public int compare(Map.Entry<String, Integer> obj1,
                    Map.Entry<String, Integer> obj2) {
                // Higher frequency first
                return Integer.compare(obj2.getValue(), obj1.getValue());
            }
        });
        this.words = new String[info.size()];
        this.wordFreqs = new int[info.size()];
        for (int i = 0; i < words.length; i++) {
            this.words[i] = info.get(i).getKey();
            this.wordFreqs[i] = info.get(i).getValue();
        }
    }

    // Prints the sorted result
    public void printResult() {
        System.out.println("Total " + this.rawWords.length + " words; the words in the article and their counts follow.");
        System.out.println("--------word----count--------");
        for (int i = 0; i < this.words.length; i++) {
            System.out.println("-------" + this.words[i] + "----" + this.wordFreqs[i] + "--------");
        }
    }

    public static void main(String[] args) {
        // Exercises the class
        Article art = new Article();
        art.splitWord();
        art.countWordFreq();
        art.sort();
        art.printResult();
    }
}
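
In this answer the order of equal-frequency words comes from HashMap iteration order, which Java leaves unspecified, so ties may appear in a seemingly arbitrary sequence. If a predictable tie order is wanted, one option (an assumption here, not something the poster did) is to count into a LinkedHashMap and rely on the stability of List.sort, so that ties stay in order of first appearance. StableWordRanker and rank are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; not part of the original answer.
class StableWordRanker {

    // Returns word/count entries, highest count first; ties keep first-appearance order.
    static List<Map.Entry<String, Integer>> rank(String[] rawWords) {
        // LinkedHashMap iterates in insertion order, i.e. order of first appearance.
        Map<String, Integer> freq = new LinkedHashMap<>();
        for (String w : rawWords) {
            if (w != null && !w.isEmpty()) {
                freq.merge(w, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        // List.sort is stable, so entries with equal counts are not reordered.
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return entries;
    }

    public static void main(String[] args) {
        String[] words = {"b", "a", "c", "a", "b", "d"};
        for (Map.Entry<String, Integer> e : rank(words)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

For the sample array this prints b 2, a 2, c 1, d 1, one entry per line: b precedes a because it appears first in the input.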

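Could you add comments? Thanks. (2) Building on the code above, read every article in a folder and print the 10 most frequent words of each article. I will add 50 bonus points.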
Because of the answer length limit, only the changes are listed.

(1) Add a constructor after the field int[] wordFreqs; (it needs the imports java.io.BufferedReader, java.io.File, java.io.FileReader and java.io.IOException):

    public Article(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader bf = new BufferedReader(new FileReader(file))) {
            String lineContent;
            while ((lineContent = bf.readLine()) != null) {
                sb.append(lineContent).append(" ");
            }
        }
        content = sb.toString();
    }

(2) Rewrite printResult:

    public void printResult() {
        System.out.println("Total " + words.length + " different words in the content!");
        // The loop header is truncated in the source text; bounding it by
        // words.length and advancing i and j together is the apparent intent.
        for (int i = 0, j = 1; i < words.length; i++, j++) {
            if (j > 10) {
                break;
            }
            System.out.println(wordFreqs[i] + " " + words[i]);
        }
    }

(3) Rewrite the main method:

    public static void main(String[] args) throws IOException {
        File file = new File("C://test"); // test folder
        if (!file.isDirectory()) {
            throw new IOException("It should be a directory!");
        }
        File[] files = file.listFiles();
        for (File fl : files) {
            if (fl.isFile()) {
                String absolutePath = fl.getAbsolutePath();
                System.out.println("For file \"" + absolutePath + "\", the top 10 words are: ");
                Article a = new Article(fl);
                a.splitWord();
                a.countWordFreq();
                a.sort();
                a.printResult();
            }
        }
    }

(4) Test results:

For file "C:\test\1.txt", the top 10 words are:
Total 99 different words in the content!
5 and
4 the
4 i
4 a
3 as
2 with
2 who
2 to
2 time
2 sverak
For file "C:\test\3.txt", the top 10 words are:
Total 99 different words in the content!
5 and
4 the
4 i
4 a
3 as
2 with
2 who
2 to
2 time
2 sverak
For file "C:\test\a.txt", the top 10 words are:
Total 9 different words in the content!
2 bb
1 m
1 fff
1 ef
1 ee
1 cc
1 c
1 adsl
1 a
For file "C:\test\b.txt", the top 10 words are:
Total 99 different words in the content!
5 and
4 the
4 i
4 a
3 as
2 with
2 who
2 to
2 time
2 sverak
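
The folder-reading steps above can also be sketched as one self-contained program built on java.nio.file. This is a minimal illustration under stated assumptions, not the poster's code: the class TopWordsPerFile and the method topWords are hypothetical, files are assumed to be UTF-8, the folder is taken from the first command-line argument, and words are taken to be lower-cased runs of the letters a to z (so i've splits into i and ve, as in the first answer).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class; not part of the original answer.
class TopWordsPerFile {

    // Returns up to 'limit' lines of the form "count word", highest count first;
    // equal counts are ordered reverse-alphabetically, as in the first answer.
    static List<String> topWords(String text, int limit) {
        Map<String, Integer> freq = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty()) {
                freq.merge(token, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : b.getKey().compareTo(a.getKey());
        });
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < limit; i++) {
            lines.add(entries.get(i).getValue() + " " + entries.get(i).getKey());
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TopWordsPerFile <folder>");
            return;
        }
        Path dir = Paths.get(args[0]);
        // newDirectoryStream throws NotDirectoryException if dir is not a folder.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip sub-folders
                }
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                System.out.println("For file \"" + file.toAbsolutePath() + "\", the top 10 words are:");
                for (String line : topWords(text, 10)) {
                    System.out.println(line);
                }
            }
        }
    }
}
```

Keeping the counting in topWords, separate from the file handling in main, lets the ranking be checked on a plain string without touching the disk.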

