Title: Chinese2PY (Chinese characters to pinyin)

Author: bailong360    Time: 2015-8-9 23:20     Title: Chinese2PY (Chinese characters to pinyin)

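  Usage
  Chinese2PY -f file
  Chinese2PY "string"
  For polyphonic characters, every reading is listed (-_-||)
  Most of the data comes from here:
  http://pan.baidu.com/share/link?shareid=3895381237&uk=1124163200
  Oddly enough, this character set has over ten thousand characters, yet it is missing "特"......
  So I had to find a Modern Chinese Dictionary (现代汉语词典) online and merge it in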
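As a rough illustration of the behaviour described in the usage notes (every reading listed for a polyphonic character), here is a minimal Python sketch. It is not the Chinese2PY source, which is not shown in this thread. The sample table, the name `to_pinyin`, and the pass-through of unknown characters are all assumptions; the "|" separator is taken from the output quoted later in the thread.

```python
# Hypothetical mini-dictionary: character -> list of readings.
# The real Chinese2PY data file is much larger and its format is not shown here.
SAMPLE_DICT = {
    "日": ["ri"],
    "深": ["shen"],
    "行": ["hang", "xing"],
}


def to_pinyin(text, table, sep="|"):
    """Convert each character to its reading(s), separated by spaces.

    Polyphonic characters list every reading joined by `sep`.
    Characters missing from the table are passed through unchanged
    (an assumption; the real tool may behave differently).
    """
    parts = []
    for ch in text:
        readings = table.get(ch)
        parts.append(sep.join(readings) if readings else ch)
    return " ".join(parts)


print(to_pinyin("行日深", SAMPLE_DICT))  # hang|xing ri shen
```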
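exe download: http://pan.baidu.com/s/1c0DEBF6
Merged character dictionary download: http://pan.baidu.com/s/1i3nLYjb
-------------
Optimized the code.
PS: Converting a 20 MB novel took 13 s, but when I opened the result, the Chinese punctuation and the like looked wrong. It turned out that npp had detected the file as UTF-8... Switching the encoding to GB2312 made it display correctly.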
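The encoding mix-up above is easy to reproduce. The Python sketch below is only an illustration, with a made-up sample string: it encodes text as GB2312, then decodes the same bytes once as UTF-8 (what the editor guessed) and once as GB2312 (the correct choice).

```python
# Sample text with full-width Chinese punctuation (illustrative only).
text = "汉字,转拼音。"

# The bytes as they would be stored in a GB2312-encoded file.
raw = text.encode("gb2312")

# Misreading those bytes as UTF-8 garbles the text; errors="replace"
# substitutes U+FFFD for byte sequences that are not valid UTF-8.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the file was actually written in restores it.
as_gb2312 = raw.decode("gb2312")

print(as_utf8 == text)    # False
print(as_gb2312 == text)  # True
```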
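Author: wskwfkbdn    Time: 2016-1-18 23:56

It looks like this works without a third-party command-line tool, which suggests it is not very efficient, and it also has to ship with a dictionary file. Still, human ingenuity is boundless.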
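Author: 娜美    Time: 2024-6-25 14:34

This post was last edited by 娜美 on 2024-7-2 21:41

OP, is this project still being maintained?
I searched around a lot and tried everything I found, and yours still seems to be the best! It supports polyphonic characters, the pinyin dictionary is comprehensive, and many rare characters can be converted too.
But a few things need improving.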

1. The OP's pinyin dictionary contains some errors
http://pan.baidu.com/s/1i3nLYjb
  ○ ling
  慌 .huang表示难以忍受
  呒 
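Entries like the second and third ones above (stray text in the reading field, or no reading at all) could be found mechanically. The Python sketch below assumes each dictionary line is a character followed by whitespace-separated readings, and that a reading consists only of lowercase ASCII letters; both are assumptions inferred from the lines quoted in this post, and `find_bad_entries` is a hypothetical helper. A check like this only catches format problems, so it would not flag an entry such as "○ ling".

```python
import re

# Assumption: a well-formed reading is one or more lowercase ASCII letters.
READING_RE = re.compile(r"[a-z]+")


def find_bad_entries(lines):
    """Return (line_number, line, reason) for entries that look malformed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) < 2:
            # A character with nothing after it, e.g. "呒 ".
            problems.append((number, line, "missing reading"))
            continue
        bad = [r for r in fields[1:] if not READING_RE.fullmatch(r)]
        if bad:
            problems.append((number, line, "malformed reading: " + " ".join(bad)))
    return problems


sample = ["○ ling", "慌 .huang表示难以忍受", "呒 ", "日 ri"]
for problem in find_bad_entries(sample):
    print(problem)
```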
In addition, some of the pinyin produced for single characters does not look like standard pinyin:
  be
  ceok
  ceom
  ceon
  ceor
  cis
  dem
  dim
  eo
  eol
  eos
  gib
  go
  hal
  hol
  hwa
  jou
  kal
  kos
  kweok
  meo
  myeo
  myeon
  myeong
  nem
  neus
  ngag
  ngai
  ngam
  nung
  oes
  ol
  on
  pak
  peol
  phas
  phdeng
  phoi
  phos
  ppun
  ram
  saeng
  sal
  sed
  sei
  seo
  seon
  sol
  tae
  tol
  uu
  zo
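I also pulled out part of the polyphonic-character output to take a look.
Some of the readings produced for polyphonic characters do not look like standard pinyin either, but it is not too bad: the problems are mostly limited to rare characters.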
  髟 bia | bian | biao | piao | shankun ('shankun' is not standard pinyin)
  欕 eom | yan ('eom' is not standard pinyin)
  甴 gad | you | zha ('gad' is not standard pinyin)
  哼 heng | hng ('hng' is not standard pinyin)
  乧 dou | dul ('dul' is not standard pinyin)
  櫷 gui | kwi ('kwi' is not standard pinyin)
  浼 mei | mel ('mel' is not standard pinyin)
  嗯 en | n | ng ('n' and 'ng' are not standard pinyin)
  昷 on | wen ('on' is not standard pinyin)
  挼 luo | rua | ruo | sui ('rua' is not standard pinyin)
  乷 sal | sha ('sal' is not standard pinyin)
  涁 lin | qin | sei | shen ('sei' is not standard pinyin)
  垈 dai | tae ('tae' is not standard pinyin)
  折 she | shw | ti | zhe ('shw' is not standard pinyin)
  獤 dun | ton ('ton' is not standard pinyin)
  膸 sui | wie ('wie' is not standard pinyin)
  曱 yue | zad ('zad' is not standard pinyin)
  咗 zo | zuo ('zo' is not standard pinyin)
  褡 d | da ('d' is not standard pinyin)
  乁 i | ji | yi ('i' is not standard pinyin)
  瑁 mao | q ('q' is not standard pinyin)
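2. For polyphonic characters, it would be better to separate the readings with a comma ",". The extra spacing makes them easier to tell apart, and wrapping them in brackets makes them clearer still.
  For example, here are the readings of a single polyphonic character; the spacing makes them easy to tell apart
  【yan,yao,yin】 ri shen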
If "|" is used as the separator, the readings sit too close together and the output becomes hard on the eyes:
  huang|kang yan|yao|yin ri shen
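The proposed output style could be produced by a small formatting step like the Python sketch below. It assumes the readings for each character are already available as lists; `format_readings` and `format_line` are hypothetical names, and leaving single-reading characters unbracketed follows the example above.

```python
def format_readings(readings, sep=",", brackets=("【", "】")):
    """Format the readings of one character.

    A single reading is returned as-is; several readings are joined
    with `sep` and wrapped in `brackets`.
    """
    if len(readings) == 1:
        return readings[0]
    left, right = brackets
    return left + sep.join(readings) + right


def format_line(per_char_readings):
    """Join the formatted readings of consecutive characters with spaces."""
    return " ".join(format_readings(r) for r in per_char_readings)


print(format_line([["yan", "yao", "yin"], ["ri"], ["shen"]]))
# 【yan,yao,yin】 ri shen
```

Keeping the separator and brackets as parameters would let the old "|" style remain available for anyone who relies on it.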
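3. I suggest that whoever takes over maintenance separate the pinyin dictionary from the code, so that the dictionary can easily be edited, corrected, and extended.
  A few lines added on top of the OP's pinyin dictionary
  慌 huang
  欸 ei
  睖 ling
  碐 ling
  稜 ling
  羐 ling
  誒 ei
  诶 ei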
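One way to keep the dictionary separate from the code is to load it from a plain text file at startup and apply corrections as a second file layered on top. The Python sketch below assumes one entry per line in the form "character reading [reading ...]", as in the lines above; the function names and the UTF-8 default are assumptions, not part of Chinese2PY.

```python
def parse_dictionary(lines):
    """Parse 'character reading [reading ...]' lines into a dict."""
    table = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and entries without a reading
        table[fields[0]] = fields[1:]
    return table


def load_dictionary(path, encoding="utf-8"):
    """Read a dictionary file from disk (path and encoding are assumptions)."""
    with open(path, encoding=encoding) as handle:
        return parse_dictionary(handle)


# In-memory stand-ins for a base dictionary file and a corrections file.
base = parse_dictionary(["慌 .huang", "日 ri"])
patch = parse_dictionary(["慌 huang", "欸 ei", "睖 ling"])

# Entries in the patch override or extend the base dictionary.
merged = {**base, **patch}
print(merged["慌"])  # ['huang']
```

With this layout, fixing an entry means editing a text file rather than rebuilding the program.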
Here are 420 standard pinyin syllables, which should cover essentially all rare characters:
  "a","wen","ming","hua","wei","xiao","hai","guo","hong","jun","yu","jian","chun","ping","zhi","lin","yun","jin","rong","yong","xin","dong","ying","cheng","li","long","de","feng","jie","fang","hui","qing","zhong","min","sheng","guang","qiang","yan","xiang","xiu","ling","fei","liang","jia","xing","mei","bao","xue","bo","bin","ya","jiang","peng","chao","xia","rui","fu","zheng","zhen","lan","song","an","juan","tao","qiu","gang","jing","zi","shi","chang","yuan","yi","bing","tian","qin","wu","xu","ze","yang","quan","you","hao","gui","kai","qun","yue","ning","ai","ren","si","shun","xian","pei","shan","gen","da","kun","yin","dan","shu","chuan","lian","xi","ting","fen","ji","zong","na","meng","chen","fa","xiong","cai","shao","qi","ke","le","ru","lei","kang","he","yao","zhao","wan","heng","hu","ju","mao","han","nan","shuang","qiong","gao","en","lai","cui","zeng","sen","shui","zhe","zhang","dao","su","huai","zu","fan","qiao","ye","shou","qian","cun","wang","run","kui","huan","ding","cong","ran","tong","zhan","zhou","jiao","zhu","ben","e","bi","bai","zhuo","nian","jiu","lu","lun","di","chong","xuan","tie","shang","shuai","ni","biao","man","hang","ruo","ri","deng","can","guan","tai","tang","liu","nai","bang","hou","neng","er","xun","zuo","san","kuan","mu","gong","miao","chu","teng","shen","sai","pin","bei","dian","dai","pu","zai","sha","duo","ceng","suo","chan","zan","ge","shuo","geng","jiong","huang","duan","zhuang","nv","huo","chi","pan","lie","she","ci","bu","gai","jue","tuan","dun","gan","lang","nong","gu","luo","kong","sun","po","chuang","ce","zun","kan","pi","lou","mou","cen","ma","cang","mi","dang","te","tu","lv","ang","sui","ti","kou","dui","nuo","mang","ao","dou","ou","shuan","niu","rang","la","mo","die","zhuan","rou","sang","kuo","xie","ka","du","luan","ku","mian","zao","chai","tuo","cao","wa","qu","tan","zhun","mai","kao","chui","bian","nuan","keng","piao","wai","kuang","cha","ban","kuai","fo","diao","sa","ba","liao","rao","men","leng","ta","lao","zuan","pian","che","zhai","ha","pai","gou","wo","tou","zhui","nen","se","re","sao","beng","nie","qia","ga","hei","pang","niang","zui","chou","niao","zou","weng","zha","que","sou","qie","nei","tiao","ken","cuo","tui","nao","tun","hun","nu","hen","shai","reng","ruan","nang","me","miu","cou","ne","suan","pao","o","gun","pie","guai","bie","pen","gua","cu","mie","pa","seng","gei","kua","zang","za","fou","zhuai","diu","cuan","zhua","ca","ei","chuo","yo","shua","pou","nin","zei","chuai","zen","lo","nou","dei","den","ron","chua","dia","eng","lia","ho","ki","ko","so","to","ra","ro","tei","lue","nue","nun","shei","zhei","lve","nve"
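A syllable list like this can serve as a whitelist for auditing the dictionary. The Python sketch below uses only a small excerpt of the list for brevity; `STANDARD` and `nonstandard_readings` are hypothetical names, and a real check would load all of the syllables.

```python
# A few syllables excerpted from the list above, for illustration only.
STANDARD = {"yan", "dou", "sha", "dai", "gui", "mei", "wen"}


def nonstandard_readings(readings, standard):
    """Return the readings that are not in the whitelist, in order."""
    return [r for r in readings if r not in standard]


# Entries taken from the polyphonic examples earlier in this post.
entries = {
    "欕": ["eom", "yan"],
    "乧": ["dou", "dul"],
    "昷": ["on", "wen"],
}

for char, readings in entries.items():
    print(char, nonstandard_readings(readings, STANDARD))
```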
I wonder whether anyone would be willing to take over and keep this project maintained and updated.




Welcome to 批处理之家 (http://bathome.net./) Powered by Discuz! 7.2