
Garbled-Text (Mojibake) Detection Utility

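Garbled text (mojibake) refers to "a series of characters that are partly or entirely unreadable because the local computer opened a source file in a text editor using a character set that does not match the file." It can arise from many different causes.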
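As a minimal sketch of one common cause, the following decodes UTF-8 bytes with the wrong charset. The class name `MojibakeDemo` and the method `garble` are hypothetical names used only for illustration, and ISO-8859-1 is assumed as the mismatched charset.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    /** Encodes text as UTF-8, then decodes those bytes with the wrong charset (ISO-8859-1). */
    public static String garble(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file encoding does not matter.
        String original = "\u4E2D\u6587";
        String garbled = garble(original);

        System.out.println(original.length()); // prints 2
        System.out.println(garbled.length());  // prints 6: each character took 3 UTF-8 bytes
    }
}
```

Each byte of the UTF-8 encoding turns into its own Latin-1 character, which is why two readable characters become six unreadable ones.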

The code for the utility class is as follows:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Utility for checking whether a Chinese string is garbled.
 */

public class ChineseUtil {
    /**
     * Checks whether a character belongs to a Chinese-related Unicode block.
     * Besides CJK ideographs, this also accepts general punctuation,
     * CJK punctuation and half-width/full-width forms.
     *
     * @author mo
     */
    private static boolean isChinese(char c) {
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
        return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                || ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
                || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS;
    }

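    /**
     * Heuristically checks whether a string is garbled.
     * Whitespace and punctuation are stripped, and letters and digits are skipped.
     * The string counts as garbled when more than 40% of the remaining symbol
     * characters fall outside the blocks accepted by isChinese.
     *
     * @author mo
     */
    public static boolean isMessyCode(String strName) {
        if (strName == null) {
            return false; // nothing to inspect
        }
        // Strip whitespace (\s already covers \t, \r and \n), then punctuation.
        Pattern p = Pattern.compile("\\s+");
        Matcher m = p.matcher(strName);
        String after = m.replaceAll("");
        String temp = after.replaceAll("\\p{P}", "");
        char[] ch = temp.toCharArray();
        int chLength = 0; // characters that are neither letters nor digits
        int count = 0;    // of those, the ones outside the Chinese-related blocks
        for (char c : ch) {
            if (!Character.isLetterOrDigit(c)) {
                if (!isChinese(c)) {
                    count++;
                }
                chLength++;
            }
        }
        if (chLength == 0) {
            return false; // no symbol characters left, so avoid dividing by zero
        }
        return (double) count / chLength > 0.4;
    }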
}
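To see which characters the heuristic actually examines, the standard `Character` classification calls can be run on a few sample characters. This is only an illustrative sketch: the class name `UnicodeBlockDemo`, the helper `isExamined`, and the three sample characters are assumptions chosen for the demonstration.

```java
public class UnicodeBlockDemo {
    /** True when c is neither a letter nor a digit, i.e. when isMessyCode would examine it. */
    public static boolean isExamined(char c) {
        return !Character.isLetterOrDigit(c);
    }

    public static void main(String[] args) {
        char ideograph = '\u4E2D';   // a common Chinese ideograph
        char modifier = '\u00B8';    // cedilla, typical debris when UTF-8 is read as ISO-8859-1
        char replacement = '\uFFFD'; // Unicode replacement character

        // Print the code point, its Unicode block, and whether the heuristic would count it.
        for (char c : new char[] {ideograph, modifier, replacement}) {
            System.out.printf("U+%04X block=%s examined=%b%n",
                    (int) c, Character.UnicodeBlock.of(c), isExamined(c));
        }
    }
}
```

Chinese ideographs are themselves letters as far as `Character.isLetterOrDigit` is concerned, so they are skipped entirely. The ratio is therefore driven only by leftover symbol characters such as the cedilla or the replacement character.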

Add the utility class above to your project and import it wherever it is needed:

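// Requires: import java.nio.charset.StandardCharsets;
// request is assumed to be the current servlet request.
String zgbm = request.getParameter("zgbm");
if (zgbm != null && ChineseUtil.isMessyCode(zgbm)) {
    // Recover the original bytes via ISO-8859-1, then decode them as UTF-8.
    // The charset constants never throw UnsupportedEncodingException, so no try/catch is needed.
    zgbm = new String(zgbm.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
}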
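The conversion above only helps when the parameter really was UTF-8 that got decoded as ISO-8859-1. As an alternative to the ratio heuristic, not the method used in this post, a stricter check can attempt the round trip and keep the result only if the bytes are valid UTF-8. The class name `Latin1Repair` and the method `repair` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Latin1Repair {
    /**
     * Returns the UTF-8 reading of value when value looks like UTF-8 bytes
     * that were decoded as ISO-8859-1; otherwise returns value unchanged.
     */
    public static String repair(String value) {
        if (value == null || value.isEmpty()) {
            return value;
        }
        boolean hasHighByte = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c > 0xFF) {
                return value; // cannot be the result of an ISO-8859-1 decode
            }
            if (c >= 0x80) {
                hasHighByte = true;
            }
        }
        if (!hasHighByte) {
            return value; // pure ASCII reads the same in both charsets
        }
        byte[] raw = value.getBytes(StandardCharsets.ISO_8859_1);
        // A strict decoder reports malformed input instead of silently substituting U+FFFD.
        CharsetDecoder strict = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return strict.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            return value; // not valid UTF-8, so leave the text alone
        }
    }
}
```

Under these assumptions a call such as `zgbm = Latin1Repair.repair(zgbm);` leaves correctly decoded text and genuine Latin-1 text such as "café" untouched, because their bytes do not form valid UTF-8 sequences.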