csharp-pinyin
1.0.1
dotnet add package csharp-pinyin --version 1.0.1
NuGet\Install-Package csharp-pinyin -Version 1.0.1
<PackageReference Include="csharp-pinyin" Version="1.0.1" />
<PackageVersion Include="csharp-pinyin" Version="1.0.1" />
<PackageReference Include="csharp-pinyin" />
paket add csharp-pinyin --version 1.0.1
#r "nuget: csharp-pinyin, 1.0.1"
#:package csharp-pinyin@1.0.1
#addin nuget:?package=csharp-pinyin&version=1.0.1
#tool nuget:?package=csharp-pinyin&version=1.0.1
Intro
csharp-pinyin is a lightweight Chinese/Cantonese to Pinyin library.
Dictionaries for other Chinese dialects can be created with makedict.
The algorithm of the initial version is based on zh_CN and has since undergone significant optimization.
pinyin-makedict is the tool for creating Chinese/Cantonese dictionaries.
Features
The interface is modeled on pypinyin.
Only characters in the Unicode range [0x4E00, 0x9FFF] are supported.
Segmentation for heteronyms (words with more than one reading).
Supports Traditional and Simplified Chinese.
Fast: about 500,000 words/s.
Achieves 99.9% accuracy without tones on a 200,000-word Lyrics-Pinyin test set.
With tones, achieves 90.3% accuracy on CPP_Dataset (about 79k sentences), compared with approximately 87% for pypinyin.
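The supported-range rule above can be checked before text is handed to the library. The following is a minimal sketch, not part of csharp-pinyin: the class `HanziRange` and its methods are hypothetical names introduced here for illustration only.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the csharp-pinyin API.
public static class HanziRange
{
    // True when the UTF-16 code unit lies in [0x4E00, 0x9FFF].
    public static bool IsSupported(char c) => c >= '\u4E00' && c <= '\u9FFF';

    // Returns the input with every character outside the range removed.
    public static string KeepSupported(string text) =>
        new string(text.Where(IsSupported).ToArray());
}
```

For example, `HanziRange.KeepSupported("明月@1几32时有##一")` returns `"明月几时有一"`. Characters outside the Basic Multilingual Plane are stored as surrogate pairs in UTF-16, so they never pass this check.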
Usage
using System.Collections.Generic;
using Pinyin;

Pinyin.Pinyin pinyinInstance = Pinyin.Pinyin.Instance; // or Pinyin.Jyutping.Instance;
string hans = "明月@1几32时有##一";
// Arguments after hans: style, error, candidates, vToU, neutralToneWithFive.
var pinyinRes = pinyinInstance.HanziToPinyin(hans, ManTone.Style.NORMAL, Error.Default, false, false, false);
List<string> pinyin = pinyinInstance.GetDefaultPinyin("了", ManTone.Style.TONE3, false, false);
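The sample string mixes Hanzi with digits and symbols, which is where the error argument matters. The sketch below illustrates the two documented modes (keep the original character, or drop it) on a per-character basis. It is an assumption-laden illustration, not the library's implementation: `ErrorMode`, `ErrorModeSketch`, and the six-entry toy dictionary are hypothetical, and the library's own tokenization of non-Hanzi runs may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in that mirrors the documented Error enum values.
public enum ErrorMode { Default = 0, Ignore = 1 }

public static class ErrorModeSketch
{
    // Toy dictionary for illustration; the real library ships its own data.
    private static readonly Dictionary<char, string> Toy = new Dictionary<char, string>
    {
        ['明'] = "ming", ['月'] = "yue", ['几'] = "ji",
        ['时'] = "shi", ['有'] = "you", ['一'] = "yi",
    };

    public static List<string> ToPinyin(string text, ErrorMode mode)
    {
        var result = new List<string>();
        foreach (char c in text)
        {
            if (Toy.TryGetValue(c, out string py))
                result.Add(py);              // conversion succeeded
            else if (mode == ErrorMode.Default)
                result.Add(c.ToString());    // keep the original character
            // ErrorMode.Ignore: the character is not exported
        }
        return result;
    }
}
```

With the sample string, joining the result with spaces gives `ming yue @ 1 ji 3 2 shi you # # yi` in `Default` mode and `ming yue ji shi you yi` in `Ignore` mode.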
Doc
// include/ChineseG2p.cs
public struct PinyinRes
{
    public string hanzi;             // UTF-16 string
    public string pinyin;            // UTF-16 string
    public List<string> candidates;  // Candidate pinyin of polyphonic characters.
    public bool error;               // Whether the conversion failed.
}
public class PinyinResList : List<PinyinRes>
{
    // Convert to a list of UTF-16 strings.
    public List<string> ToStrList();
    // Convert to a single UTF-16 string joined by a delimiter (default: " ").
    public string ToStr(string delimiter = " ");
}
// ChineseG2p.cs
public enum Error
{
    // Keep original characters
    Default = 0,
    // Ignore this character (do not export)
    Ignore = 1
}
/*
    @param hans : raw UTF-16 string.
    @param style : Pinyin tone style. Default: ManTone.Style.TONE.
    @param error : How to handle characters that fail conversion. Default: keep the original (Error.Default).
    @param candidates : Return all possible pinyin candidates. Default: true.
    @param vToU : Convert v to ü. Default: false.
    @param neutralToneWithFive : Use 5 as neutral tone. Default: false.
    @return PinyinResList.
*/
public PinyinResList HanziToPinyin(string hans,
                                   ManTone.Style style = ManTone.Style.TONE,
                                   Error error = Error.Default, bool candidates = true,
                                   bool vToU = false, bool neutralToneWithFive = false);
/*
@param hans : raw UTF-16 List<string>; each element of the list is a character.
...
@return PinyinResList.
*/
public PinyinResList HanziToPinyin(List<string> hans,
ManTone.Style style = ManTone.Style.TONE,
Error error = Error.Default, bool candidates = true,
bool vToU = false, bool neutralToneWithFive = false);
// Convert Traditional Chinese to Simplified Chinese.
string TradToSim(string text);
// Determine if it is a polyphonic character.
bool IsPolyphonic(string text);
// Get a pronunciation list.
public List<string> GetDefaultPinyin(string hanzi,
ManTone.Style style = ManTone.Style.TONE,
bool vToU = false, bool neutralToneWithFive = false);
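The vToU and neutralToneWithFive flags are easiest to see on a single syllable. The sketch below assumes they behave like the pypinyin options of the same names and assumes a TONE3-like layout in which the tone digit follows the syllable. `ToneOptionsSketch.FormatTone3` is a hypothetical helper for illustration and is not part of the csharp-pinyin API.

```csharp
using System;

// Hypothetical helper, not part of the csharp-pinyin API.
public static class ToneOptionsSketch
{
    // syllable: toneless pinyin that writes ü as 'v' (for example "lv").
    // tone: 1 to 4, or 0 for the neutral tone.
    public static string FormatTone3(string syllable, int tone,
                                     bool vToU = false,
                                     bool neutralToneWithFive = false)
    {
        // Pinyin has no native 'v', so every 'v' stands for ü.
        string body = vToU ? syllable.Replace('v', 'ü') : syllable;

        if (tone == 0)
        {
            // Neutral tone: either leave the syllable bare or mark it with 5.
            return neutralToneWithFive ? body + "5" : body;
        }
        return body + tone;
    }
}
```

Under these assumptions, `FormatTone3("lv", 4)` yields `"lv4"`, `FormatTone3("lv", 4, vToU: true)` yields `"lü4"`, `FormatTone3("le", 0)` yields `"le"`, and `FormatTone3("le", 0, neutralToneWithFive: true)` yields `"le5"`.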
Open-source software used
zh_CN: Source of the core algorithm, further tailored to this project's dictionary.
opencpop: Test data source.
M4Singer: Test data source.
cc-edict: Dictionary source.
pinyin: Source of the fan-jian (Traditional/Simplified) dictionary.
cpp_dataset: Source of the CPP_Dataset test set.
Related Projects
pinyin-makedict: A tool for creating Chinese/Cantonese dictionaries.
cpp-pinyin: A C++ implementation of the Chinese/Cantonese to Pinyin library.
python-pinyin: A Python implementation of the Chinese/Cantonese to Pinyin library.
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
.NET Core | netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.1 is compatible. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
This package has no dependencies.
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories (2)
Showing the top 2 popular GitHub repositories that depend on csharp-pinyin:
Repository | Description |
---|---|
stakira/OpenUtau | Open singing synthesis platform / Open source UTAU successor |
LiuYunPlayer/TuneLab | |