WeClone is an open-source project aimed at creating personalized AI digital avatars using users' WeChat chat records and voice data.
Personalized conversation cloning: Use the PyWxDump tool to export WeChat chat records as CSV files, automatically process them into question-and-answer pairs, and filter out sensitive information. Based on large language models such as Qwen2.5-7B and ChatGLM3-6B, it employs LoRA fine-tuning technology to train a realistic conversational style with just 16GB of GPU memory. It can adjust responses based on conversational context, mimicking the user's language style, expression habits, and more.
High-precision voice cloning (WeClone - audio): Using lightweight models with 0.5B parameters, such as Spark - TTS, it only requires a 5-second voice sample to replicate the user's voice, supports emotion transfer and dialect imitation, and has low memory requirements of just 4GB, making it suitable for quick deployment by individual users.
Multi-platform deployment and expansion: Supports WeChat, QQ, Telegram, Feishu, and other platforms, enabling automated message responses through the AstrBot framework. Provides Docker containerized deployment and API interfaces for developers to integrate into enterprise systems, such as customer service robots. Also supports OpenAI-compatible APIs for integration with enterprise-level chatbot platforms.
Data processing and privacy protection: Implements a complete data cleaning and privacy protection mechanism, using regular expressions and other technologies to identify and remove sensitive information from chat records. Supports fine-tuning and deployment in local environments to ensure user data security.
Technical Advantages: The LoRA fine-tuning technology significantly reduces memory requirements, supports multi-card training acceleration, and employs QLoRA quantization technology to further reduce memory requirements for model training. It also supports multi-card parallel distributed training, enhancing efficiency in training with large-scale data.
Application Scenarios: For individual users, it is possible to create digital avatars of oneself or others to achieve digital immortality, preserve memories of people, or have digital avatars respond to messages on one's behalf during busy periods. For enterprise users, it can be integrated into systems such as customer service robots to provide personalized services.
The WeClone project combines LLM fine-tuning and voice cloning technology to provide a complete open-source solution for developing personalized AI assistants, offering high practical value and scalability potential. |