csharp-pinyin 1.0.1

.NET CLI:
dotnet add package csharp-pinyin --version 1.0.1

Package Manager (intended for the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package):
NuGet\Install-Package csharp-pinyin -Version 1.0.1

PackageReference (for projects that support PackageReference, copy this XML node into the project file to reference the package):
<PackageReference Include="csharp-pinyin" Version="1.0.1" />

Central Package Management (for projects that support CPM, copy the PackageVersion node into the solution Directory.Packages.props file to version the package, and reference it from the project file):
Directory.Packages.props:
<PackageVersion Include="csharp-pinyin" Version="1.0.1" />
Project file:
<PackageReference Include="csharp-pinyin" />

Paket CLI:
paket add csharp-pinyin --version 1.0.1

#r directive (can be used in F# Interactive and Polyglot Notebooks; copy it into the interactive tool or the source code of the script):
#r "nuget: csharp-pinyin, 1.0.1"

#:package directive (can be used in C# file-based apps starting in .NET 10 preview 4; copy it into a .cs file before any lines of code):
#:package csharp-pinyin@1.0.1

Cake (install as a Cake Addin):
#addin nuget:?package=csharp-pinyin&version=1.0.1

Cake (install as a Cake Tool):
#tool nuget:?package=csharp-pinyin&version=1.0.1

csharp-pinyin

Intro

csharp-pinyin is a lightweight Chinese/Cantonese-to-Pinyin library.

Dictionaries for other Chinese dialects can be built with pinyin-makedict.

The initial version of the algorithm was based on zh_CN and has since been significantly optimized.

pinyin-makedict is the tool for creating Chinese/Cantonese dictionaries.

Features

  • The interface is modeled on pypinyin.

  • Only characters in the Unicode range [0x4E00, 0x9FFF] are supported.

  • Word segmentation for heteronyms.

  • Supports Traditional and Simplified Chinese.

  • Fast: about 500,000 words/s.

  • Without tones, it achieves 99.9% accuracy on a 200,000-word Lyrics-Pinyin test set.

  • With tones, it achieves 90.3% accuracy on CPP_Dataset (about 79k sentences), compared with approximately 87% for pypinyin.
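The supported range listed above can be checked before text is passed to the library. The following is a minimal sketch, not part of csharp-pinyin: the helper name IsSupportedHanzi is hypothetical, and it only tests whether a UTF-16 code unit falls in U+4E00..U+9FFF (the CJK Unified Ideographs block).

```csharp
using System;

// Hypothetical helper (not a csharp-pinyin API): true when the UTF-16 code
// unit lies in U+4E00..U+9FFF, the range the feature list names as supported.
static bool IsSupportedHanzi(char c) => c >= '\u4E00' && c <= '\u9FFF';

string sample = "明月@1几32时有##一";
foreach (char c in sample)
{
    // Report, per character, whether it falls inside the supported range.
    Console.WriteLine($"{c}: {(IsSupportedHanzi(c) ? "supported" : "unsupported")}");
}
```

A check like this is one way to predict which characters the conversion is likely to treat as failures; how the library itself groups or reports such characters is not shown here.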

Usage

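using System.Collections.Generic;
using Pinyin;

Pinyin.Pinyin pinyinInstance = Pinyin.Pinyin.Instance; // or Pinyin.Jyutping.Instance;
string hans = "明月@1几32时有##一";
// Convert the whole string: NORMAL style, keep characters that fail conversion, no candidates.
var pinyinRes = pinyinInstance.HanziToPinyin(hans, ManTone.Style.NORMAL, Error.Default, false, false, false);

// Get the pronunciation list of a single character in TONE3 style.
List<string> pinyin = pinyinInstance.GetDefaultPinyin("了", ManTone.Style.TONE3, false, false);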

Doc

//  include/ChineseG2p.cs
public struct PinyinRes
{
    public string hanzi;               //  utf-16 string
    public string pinyin;              //  utf-16 string
    public List<string> candidates;    //  Candidate pinyin of polyphonic characters.
    public bool error;                 //  Whether the conversion failed.
}

public class PinyinResList : List<PinyinRes>
{
    //  Convert to utf-16 string list.
    public List<string> ToStrList();
    //  Convert to utf-16 string with delimiter (default: " ").
    public string ToStr(string delimiter = " ");
}

//  ChineseG2p.cs
public enum Error
{
    // Keep original characters
    Default = 0,
    // Ignore this character (do not export)
    Ignore = 1
}

/*
    @param hans : raw utf-16 string.
    @param style : Pinyin tone style. Default: ManTone.Style.TONE.
    @param error : How to handle characters that fail conversion. Default: keep the original characters.
    @param candidates : Return all possible pinyin candidates. Default: true.
    @param vToU : Convert v to ü. Default: false.
    @param neutralToneWithFive : Use 5 as neutral tone. Default: false.
    @return PinyinResList.
*/
public PinyinResList HanziToPinyin(string hans,
                                        ManTone.Style style = ManTone.Style.TONE,
                                        Error error = Error.Default, bool candidates = true,
                                        bool vToU = false, bool neutralToneWithFive = false);

/*
    @param hans : raw utf-16 List<string>, each element of the list is a character.
    ...
    @return PinyinResList.
*/
public PinyinResList HanziToPinyin(List<string> hans,
                                        ManTone.Style style = ManTone.Style.TONE,
                                        Error error = Error.Default, bool candidates = true,
                                        bool vToU = false, bool neutralToneWithFive = false);

//  Convert Traditional Chinese to Simplified Chinese.
string TradToSim(string text);

//  Determine if it is a polyphonic character.
bool IsPolyphonic(string text);

//  Get a pronunciation list.
public List<string> GetDefaultPinyin(string hanzi,
                                     ManTone.Style style = ManTone.Style.TONE,
                                     bool vToU = false, bool neutralToneWithFive = false);
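The two Error options documented above (keep the original characters, or ignore them) can be illustrated with a toy converter. This is a sketch under stated assumptions, not the library's implementation: the dictionary toyDict, the helper ToyConvert, and the ignoreFailed flag are all hypothetical, and the per-character handling shown here may differ from how csharp-pinyin groups unconverted text.

```csharp
using System;
using System.Collections.Generic;

// Toy lookup table standing in for a real dictionary (illustrative only).
var toyDict = new Dictionary<char, string>
{
    ['明'] = "ming", ['月'] = "yue", ['几'] = "ji",
    ['时'] = "shi", ['有'] = "you", ['一'] = "yi",
};

// Hypothetical helper: ignoreFailed = false mimics the documented
// Error.Default behavior (keep the original character);
// ignoreFailed = true mimics Error.Ignore (do not export it).
List<string> ToyConvert(string text, bool ignoreFailed)
{
    var result = new List<string>();
    foreach (char c in text)
    {
        if (toyDict.TryGetValue(c, out var py))
            result.Add(py);              // character found: emit its pinyin
        else if (!ignoreFailed)
            result.Add(c.ToString());    // not found: keep the original character
        // not found and ignoreFailed: skip the character entirely
    }
    return result;
}

Console.WriteLine(string.Join(" ", ToyConvert("明月@1几", ignoreFailed: false)));
Console.WriteLine(string.Join(" ", ToyConvert("明月@1几", ignoreFailed: true)));
```

With this toy table, the first call prints "ming yue @ 1 ji" and the second prints "ming yue ji".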

Open-source software used

  • zh_CN: The core algorithm source, further tailored to the dictionary in this project.

  • opencpop: The test data source.

  • M4Singer: The test data source.

  • cc-edict: The dictionary source.

  • pinyin: The fan-jian dictionary source.

  • cpp_dataset: The cpp_dataset source.

  • pinyin-makedict: A tool for creating Chinese/Cantonese dictionaries.

  • cpp-pinyin: A C++ implementation of the Chinese/Cantonese to Pinyin library.

  • python-pinyin: A Python implementation of the Chinese/Cantonese to Pinyin library.

Compatible and additional computed target framework versions

.NET
  Compatible: net6.0
  Computed: net5.0, net5.0-windows
  Computed: net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows
  Computed: net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows
  Computed: net8.0, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows
  Computed: net9.0, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows
  Computed: net10.0, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows
.NET Core
  Computed: netcoreapp3.0, netcoreapp3.1
.NET Standard
  Compatible: netstandard2.1
MonoAndroid
  Computed: monoandroid
MonoMac
  Computed: monomac
MonoTouch
  Computed: monotouch
Tizen
  Computed: tizen60
Xamarin.iOS
  Computed: xamarinios
Xamarin.Mac
  Computed: xamarinmac
Xamarin.TVOS
  Computed: xamarintvos
Xamarin.WatchOS
  Computed: xamarinwatchos

This package has no dependencies.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories (2)

Showing the top 2 popular GitHub repositories that depend on csharp-pinyin:

  • stakira/OpenUtau: Open singing synthesis platform / Open source UTAU successor

  • LiuYunPlayer/TuneLab

Version   Downloads   Last Updated
1.0.1     551         11/9/2024
1.0.0     16,581      9/9/2024