CopyCatcher 1.0.0-alpha
Copy Catcher
<a name="overview"></a> Overview
Copy Catcher is a NuGet package that identifies and lists duplicate files within a specified directory. It combines buffered reading, early byte comparison, chunk hashing, and parallel processing to detect files with identical content efficiently and accurately.
<a name="keybenefits"></a> Key Benefits & Features
- Buffered Reading: Copy Catcher uses buffered reading to efficiently read large files in chunks, reducing memory usage and enhancing performance.
- Asynchronous Operations: The package is designed to leverage asynchronous operations, ensuring non-blocking I/O. This results in a smoother user experience, especially when dealing with large directories or files.
- Early Byte Exiting: Before hashing the entire file, Copy Catcher checks the initial bytes of files. If two files have different initial bytes, they are immediately identified as distinct, saving computational resources.
- Chunk Hashing: Instead of hashing the entire file in one go, Copy Catcher hashes files in chunks. This approach is more memory-efficient and allows for faster identification of large duplicate files.
- Parallelism: The package employs parallel processing to scan and hash multiple files concurrently. This takes full advantage of multi-core processors, drastically reducing the time required to identify duplicates in large directories.
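To make the early-exit and chunk-hashing ideas concrete, here is a minimal sketch of how such checks could be written. It is not the package's actual implementation: the class name `DuplicateCheckSketch`, the prefix and chunk sizes, and the choice of SHA-256 are all assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, not part of the CopyCatcher API.
public static class DuplicateCheckSketch
{
    // Assumed sizes, chosen for illustration only.
    private const int PrefixLength = 4096;
    private const int ChunkSize = 81920;

    // Early exit: files whose lengths or leading bytes differ cannot be duplicates.
    // A true result only means the files are still candidates; a full hash decides.
    public static bool MayBeDuplicates(string pathA, string pathB)
    {
        if (new FileInfo(pathA).Length != new FileInfo(pathB).Length)
            return false;

        byte[] prefixA = ReadPrefix(pathA);
        byte[] prefixB = ReadPrefix(pathB);
        return prefixA.AsSpan().SequenceEqual(prefixB);
    }

    // Reads up to PrefixLength bytes from the start of the file.
    private static byte[] ReadPrefix(string path)
    {
        using var stream = File.OpenRead(path);
        var buffer = new byte[PrefixLength];
        int total = 0;
        int read;
        while (total < buffer.Length &&
               (read = stream.Read(buffer, total, buffer.Length - total)) > 0)
        {
            total += read;
        }
        return buffer.AsSpan(0, total).ToArray();
    }

    // Chunk hashing: feed the file to the hash one buffer at a time,
    // so memory use stays bounded regardless of file size.
    public static string HashInChunks(string path)
    {
        using var hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
        using var stream = File.OpenRead(path);
        var buffer = new byte[ChunkSize];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            hasher.AppendData(buffer, 0, read);
        }
        return Convert.ToHexString(hasher.GetHashAndReset());
    }
}
```

In this sketch, `MayBeDuplicates` is the cheap filter and `HashInChunks` is only worth calling on pairs that pass it.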
<a name="gettingstarted"></a> Getting Started
<a name="prerequisites"></a>
Prerequisites
- .NET 7 SDK or later installed on your machine (the package targets net7.0).
- A .NET project where you want to use Copy Catcher.
<a name="installation"></a>
Installation
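Install the Copy Catcher NuGet package using the NuGet Package Manager Console. Only a prerelease version is published, so specify the version explicitly:
Install-Package CopyCatcher -Version 1.0.0-alpha
Or using the .NET CLI:
dotnet add package CopyCatcher --version 1.0.0-alpha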
Usage
<a name="integration"></a>
Integration
In your .NET project, add the following using directive:
using CopyCatcher.Shared;
Create an instance of the DuplicateFinderService:
var service = new DuplicateFinderService("path/to/directory");
Call the FindDuplicates method:
var duplicates = service.FindDuplicates();
<a name="output"></a>
Output
The FindDuplicates method returns a dictionary whose keys are hash values and whose values are lists of the file paths that share that hash:
{
"abc123def456": ["path/to/duplicate1.txt", "path/to/duplicate2.txt"],
...
}
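As an illustration of working with a result of this shape, the sketch below picks out the redundant copies in each group. It assumes the return type is compatible with `IReadOnlyDictionary<string, List<string>>`, which the description above suggests but does not confirm; `DuplicateReportSketch` and `RedundantCopies` are hypothetical names, not part of the package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the CopyCatcher API.
public static class DuplicateReportSketch
{
    // For each hash shared by two or more files, keeps the first path
    // (in ordinal order) and returns all the others as redundant copies.
    public static List<string> RedundantCopies(
        IReadOnlyDictionary<string, List<string>> groups)
    {
        return groups.Values
            .Where(paths => paths.Count > 1)
            .SelectMany(paths => paths
                .OrderBy(p => p, StringComparer.Ordinal)
                .Skip(1))
            .ToList();
    }
}
```

The `Count > 1` filter makes the helper safe whether or not the dictionary also contains hashes that map to a single file.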
<a name="console-app-example"></a>
Console App Example
A simple .NET Console app using Copy Catcher would look like this:
using CopyCatcher;
Console.WriteLine("Enter the directory path:");
var directoryPath = Console.ReadLine();
// Initialize the service and find duplicates
var duplicateFinderService = new DuplicateFinderService(directoryPath);
var duplicates = duplicateFinderService.FindDuplicates();
// Display results
foreach (var duplicate in duplicates)
{
    Console.WriteLine($"Hash: {duplicate.Key}");
    foreach (var filePath in duplicate.Value)
    {
        Console.WriteLine($" - {filePath}");
    }
}
How It Works
<a name="components"></a>
Components
- FileReader: Reads files from the file system.
- FileHasher: Computes a hash value for each file to determine duplicates.
- DirectoryScanner: Scans the specified directory and retrieves a list of all files. It uses the DirectoryProvider to access the file system, ensuring better testability and separation of concerns.
- DirectoryProvider: Provides direct access to the file system, used by DirectoryScanner.
- DuplicateFinderService: The main service that ties all components together and provides an easy-to-use interface for finding duplicates.
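The testability benefit of routing file-system access through a provider can be sketched as follows. The interface and class names here are hypothetical stand-ins with a `Sketch` suffix; the package's real types and signatures may differ, and whether the real scanner recurses into subdirectories is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical abstraction over the file system; not the package's real interface.
public interface IDirectoryProviderSketch
{
    IEnumerable<string> EnumerateFiles(string directory);
}

// Production-style provider that touches the real file system.
// Recursing into subdirectories is an assumption.
public sealed class PhysicalDirectoryProviderSketch : IDirectoryProviderSketch
{
    public IEnumerable<string> EnumerateFiles(string directory) =>
        Directory.EnumerateFiles(directory, "*", SearchOption.AllDirectories);
}

// Test double that serves a fixed list of paths, so no disk access is needed.
public sealed class InMemoryDirectoryProviderSketch : IDirectoryProviderSketch
{
    private readonly IReadOnlyList<string> _files;

    public InMemoryDirectoryProviderSketch(params string[] files) => _files = files;

    public IEnumerable<string> EnumerateFiles(string directory) =>
        _files.Where(f => f.StartsWith(directory, StringComparison.Ordinal));
}

// The scanner depends only on the interface, so either provider can be injected.
public sealed class DirectoryScannerSketch
{
    private readonly IDirectoryProviderSketch _provider;

    public DirectoryScannerSketch(IDirectoryProviderSketch provider) =>
        _provider = provider;

    public List<string> Scan(string directory) =>
        _provider.EnumerateFiles(directory).ToList();
}
```

A unit test can construct `DirectoryScannerSketch` with the in-memory provider and assert on the returned paths without creating any files.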
<a name="workflow"></a>
Workflow
- The user specifies a directory to be scanned.
- DirectoryScanner retrieves a list of all files in the directory.
- FileHasher computes a hash for each file.
- Duplicate files are identified based on their hash values and returned in a dictionary.
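The steps above can be condensed into a short end-to-end sketch. This is an approximation under stated assumptions, not the package's code: `WorkflowSketch` is a hypothetical name, SHA-256 stands in for whatever hash the package uses, only the top-level directory is scanned, and hashes that belong to a single file are dropped from the result.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Threading.Tasks;

// Hypothetical end-to-end sketch; not part of the CopyCatcher API.
public static class WorkflowSketch
{
    public static Dictionary<string, List<string>> FindDuplicates(string directory)
    {
        // Thread-safe collection for (hash, path) pairs produced in parallel.
        var entries = new ConcurrentBag<(string Hash, string Path)>();

        // Steps 2 and 3: list the files and hash them concurrently.
        Parallel.ForEach(Directory.EnumerateFiles(directory), path =>
        {
            using var stream = File.OpenRead(path);
            using var sha = SHA256.Create(); // assumed algorithm
            string hash = Convert.ToHexString(sha.ComputeHash(stream));
            entries.Add((hash, path));
        });

        // Step 4: group paths by hash and keep groups with more than one file.
        return entries
            .GroupBy(e => e.Hash)
            .Where(g => g.Count() > 1)
            .ToDictionary(
                g => g.Key,
                g => g.Select(e => e.Path).OrderBy(p => p, StringComparer.Ordinal).ToList());
    }
}
```

Each parallel iteration creates its own `SHA256` instance because hash algorithm instances are not safe to share across threads.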
Product | Compatible and additional computed target framework versions |
---|---|
.NET | net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
Dependencies
- net7.0: No dependencies.
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
1.0.0-alpha | 182 | 10/31/2023 |