
This topic was created 158 days ago; the information mentioned may have changed or developed since then.

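Dear friends, I'd like to share some of my experiments on the core architecture of language models. If you have the time and interest, please take a look and tell me what you think. The code and weights are fully open source, and the papers on the mathematics behind the design are also on GitHub, all under the MIT license, so feel free to build on them: https://github.com/makai891124-prog/H2Q-MicroStream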

🌌 H2Q-MicroStream: The Hamiltonian Thinking Kernel

"Intelligence is not about memorizing every detail of the past, but about mastering the core equations that generate the future."

📖 Introduction

H2Q-MicroStream is a paradigm experiment in physics-informed AI. Unlike mainstream Transformers, which rely on piling up parameters and very long context windows, H2Q builds a minimalist "Thinking Kernel" based on Hamiltonian dynamics and quaternion algebra.

This project shows that, under a strict Rank-8 constraint and Unicode-level streaming, logical reasoning and grammatical ability can emerge within a VRAM footprint of only 0.2GB.

🚀 Key Features

1. Rank-8 Essentialism

The Concept: We enforce a strict rank limit (Rank=8) on the generative weight matrices. This forces the model to abandon rote memorization and extract only the most fundamental laws of language evolution.
The Result: A 13MB checkpoint that captures the syntax and logic of English.
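To make the rank constraint concrete, here is a minimal sketch of one common way to enforce it. It is not taken from the repository: a weight matrix is stored as the product of two thin factors U and V, so its rank can never exceed 8. The layer size of 256 and all function names are illustrative assumptions.

```python
# Illustrative only: sizes and names are assumptions, not H2Q-MicroStream code.
import random

RANK = 8

def low_rank_param_count(d_out, d_in, rank=RANK):
    """Parameters needed to store W = U @ V with U: d_out x rank, V: rank x d_in."""
    return rank * (d_out + d_in)

def make_factors(d_out, d_in, rank=RANK, seed=0):
    """Random thin factors; their product has rank at most `rank`."""
    rng = random.Random(seed)
    u = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(d_out)]
    v = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(rank)]
    return u, v

def apply_low_rank(u, v, x):
    """Compute (U @ V) @ x as U @ (V @ x), never building the full matrix."""
    hidden = [sum(row[j] * x[j] for j in range(len(x))) for row in v]  # length = rank
    return [sum(row[k] * hidden[k] for k in range(len(hidden))) for row in u]

d = 256
full = d * d                       # dense 256 x 256 matrix
low = low_rank_param_count(d, d)   # 8 * (256 + 256)
u, v = make_factors(d, d)
y = apply_low_rank(u, v, [1.0] * d)
print(full, low, len(y))
```

With these assumed sizes the factored form needs 4,096 parameters instead of 65,536, which is the kind of saving that makes a very small checkpoint plausible.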

2. Hamiltonian & Quaternion Core

Implements a balanced Hamiltonian layer that conserves energy and preserves structural symmetry.
Uses Quaternion Attention to model semantic relationships as phase rotations in high-dimensional space.
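The two ideas named here can be illustrated with textbook building blocks. The sketch below is not the project's layer or attention code: it uses a leapfrog integrator on a unit harmonic oscillator to show what near-exact energy conservation looks like numerically, and the Hamilton product to show a quaternion acting as a rotation. All names are hypothetical.

```python
# Illustrative sketch; none of these names come from the H2Q-MicroStream code.
import math

def leapfrog(q, p, dt, steps):
    """Symplectic leapfrog steps for H(q, p) = p^2/2 + q^2/2."""
    for _ in range(steps):
        p -= 0.5 * dt * q   # half step in momentum: dp/dt = -dH/dq = -q
        q += dt * p         # full step in position: dq/dt = dH/dp = p
        p -= 0.5 * dt * q   # second half step in momentum
    return q, p

def energy(q, p):
    return 0.5 * (p * p + q * q)

def hamilton_product(a, b):
    """Quaternion product of a = (w, x, y, z) and b = (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def rotate(vec, axis, angle):
    """Rotate a 3-vector about a unit axis via q * v * conj(q)."""
    s = math.sin(angle / 2)
    q = (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)
    q_conj = (q[0], -q[1], -q[2], -q[3])
    _, x, y, z = hamilton_product(hamilton_product(q, (0.0, *vec)), q_conj)
    return (x, y, z)

q1, p1 = leapfrog(1.0, 0.0, dt=0.01, steps=10_000)
drift = abs(energy(q1, p1) - energy(1.0, 0.0))
rotated = rotate((1.0, 0.0, 0.0), axis=(0.0, 0.0, 1.0), angle=math.pi / 2)
print(drift, rotated)
```

After 10,000 steps the energy drift stays tiny, and rotating the x unit vector by 90 degrees about z lands on the y unit vector up to floating-point error. How the repository wires such pieces into a trainable layer is described in its own papers and code.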

3. Rolling Horizon Validation

Mechanism: Train[T] -> Valid[T+1] -> T becomes T+1.
We validate the model on the immediate future (the next chunk) before training on it: the "present" model is scored on "future" data and only then learns from that data. This strictly measures the model's ability to extrapolate logic, not just interpolate data it has already seen.
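The mechanism can be sketched as a short loop. This is a toy version under stated assumptions, not the contents of train.py: the model is a stand-in byte-frequency model, the chunk size of 256 bytes is arbitrary, and `ByteUnigram` and `rolling_horizon` are hypothetical names.

```python
# Illustrative sketch with a toy model; the real train.py may be organized differently.
import math
from collections import Counter

class ByteUnigram:
    """Toy stand-in model: add-one-smoothed byte frequencies."""
    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def train(self, chunk: bytes):
        self.counts.update(chunk)
        self.total += len(chunk)

    def loss(self, chunk: bytes) -> float:
        """Mean negative log-likelihood in nats per byte."""
        denom = self.total + 256
        return -sum(math.log((self.counts[b] + 1) / denom) for b in chunk) / len(chunk)

def rolling_horizon(chunks, model):
    """Train on chunk T, score the unseen chunk T+1, then advance so T+1 becomes T."""
    history = []
    for t in range(len(chunks) - 1):
        model.train(chunks[t])
        val = model.loss(chunks[t + 1])   # chunk T+1 has not been trained on yet
        train = model.loss(chunks[t])     # loss on the chunk just learned
        history.append((t, train, val))
    return history

text = b"Lily went to the park. Tom went to the pond. " * 40
chunks = [text[i:i + 256] for i in range(0, len(text), 256)]
history = rolling_horizon(chunks, ByteUnigram())
for t, train, val in history:
    print(f"Chunk {t} | Train: {train:.4f} | Val: {val:.4f}")
```

The key property is ordering: every validation number is computed on data the model has never been updated on, so no separate held-out split is needed.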

4. Unicode Stream

No tokenizer. No vocabulary bias. The model reads raw bytes (0-255), treating language as a pure physical signal stream.
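As a small illustration of what byte-level input looks like, the sketch below encodes a mixed English and Chinese string and builds next-byte training pairs. UTF-8 is an assumption here (the post only says "raw bytes"), and the variable names are hypothetical.

```python
# Illustrative only: assumes UTF-8 encoding, which the post does not state explicitly.
text = "Hi 你好"
stream = list(text.encode("utf-8"))   # every value is an integer in 0..255

# Next-byte prediction pairs: the model sees stream[i] and must predict stream[i + 1].
pairs = list(zip(stream[:-1], stream[1:]))

# The mapping is lossless, so generated bytes can be decoded back to text.
decoded = bytes(stream).decode("utf-8")
print(stream, len(pairs), decoded)
```

ASCII characters take one byte each while each of these Chinese characters takes three, so the same 256-symbol alphabet covers any language at the cost of longer sequences.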

📊 Performance

Tested on an NVIDIA RTX 4070 Ti SUPER with the TinyStories dataset.

Convergence: Loss dropped from 2.88 to 1.02 (near the Shannon entropy limit for simple English).
Generalization: Achieved a negative diff (validation loss < training loss), which we take as evidence that the model learned the underlying rules rather than memorizing the data.
Efficiency:
- VRAM Usage: ~0.2 GB
- Throughput: ~10,000 tokens/s (with no tokenizer, one token is one byte)
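To put the loss figures in more familiar units, the sketch below converts them to bits per byte and perplexity. It assumes the reported loss is a mean cross-entropy in nats per byte (natural logarithm, which is what PyTorch's cross-entropy loss computes); the post does not state the unit, so treat this as an assumption.

```python
# Assumes the loss is mean cross-entropy in nats per byte; the post does not say so.
import math

def nats_to_bits(loss_nats):
    """Convert a natural-log loss to bits."""
    return loss_nats / math.log(2)

def perplexity(loss_nats):
    """Effective number of equally likely next bytes."""
    return math.exp(loss_nats)

for loss in (2.88, 1.02):
    print(f"{loss:.2f} nats/byte = {nats_to_bits(loss):.3f} bits/byte, "
          f"perplexity {perplexity(loss):.2f}")

# Reference point: guessing uniformly over all 256 byte values.
uniform_baseline = math.log(256)
print(f"uniform baseline = {uniform_baseline:.3f} nats/byte = 8 bits/byte")
```

Under that assumption, a loss of 1.02 corresponds to roughly 1.47 bits per byte, compared with 8 bits per byte for uniform guessing.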

🛠️ Usage

1. Install Dependencies

pip install -r requirements.txt

2. Run Training

The script automatically downloads the TinyStories dataset and starts the "Rolling Horizon" training loop.

python train.py

3. Monitor

The terminal displays a real-time "ICU Dashboard":

Chunk 18 | Train: 1.0420 | Val: 1.0622 | Energy: 68.5 | Speed: 311ms
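If you want to track these numbers over a run, a line in this format can be parsed with a regular expression. The sketch below is a hypothetical helper, not part of the repository, and it assumes "Diff" means validation loss minus training loss, matching the "negative diff" wording in the Performance section.

```python
# Hypothetical helper; the field layout is copied from the sample line above.
import re

LINE = "Chunk 18 | Train: 1.0420 | Val: 1.0622 | Energy: 68.5 | Speed: 311ms"

PATTERN = re.compile(
    r"Chunk (?P<chunk>\d+) \| Train: (?P<train>[\d.]+) \| Val: (?P<val>[\d.]+)"
    r" \| Energy: (?P<energy>[\d.]+) \| Speed: (?P<speed_ms>\d+)ms"
)

def parse_dashboard_line(line):
    """Return the dashboard fields as numbers, or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    row = {key: float(value) for key, value in m.groupdict().items()}
    row["chunk"] = int(row["chunk"])
    row["diff"] = row["val"] - row["train"]   # assumed definition: Val - Train
    return row

row = parse_dashboard_line(LINE)
print(row)
```

For the sample line the computed diff is about +0.0202, meaning validation loss is slightly above training loss on that particular chunk.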

🔮 Vision

We are moving from "Statistical Correlation" to "Dynamical Causality". H2Q is not just a language model; it is a digital lifeform attempting to resonate with the mathematical structure of the universe.

Log output from the experiment run:

🌊 H2Q-ICU Monitor Online: NVIDIA GeForce RTX 4070 Ti SUPER
[Mode: Deep Analysis] [Metrics: Grad/VRAM/TPS/Diff]
🔄 Restoring checkpoint: h2q_rolling.pt
🔖 [Wheel of Time] Rewinding to offset: 40.03 MB
⏳ [Init] Loading initial time chunk (Chunk T)...
🚀 Starting deep monitoring (Deep Monitor Active)...

📜 [Thought Stream]: They wanted to go you cose friends with a llock. He saw a balought in the grasss and laughes. He was so readys yare and granded drank he fout; " Humhe, they face and ploud need a cup tiny the close. He

📜 [Thought Stream]: They would said, "Maybe she left," she said nexck, but I'm a great stuffles in the rabbit revere." Lily smiled and said, "Ben, what no Tom. Daddy you love the askaching it was in the dog." He tried and

📜 [Thought Stream]: Tom. He asked them home in the both again. He said, "Lily, sad. He is not owl. But Let's so friend. He opened hard away. Lucy like the garden." And. She tears the pond. She said, "Bob wand. Can I see s

📜 [Thought Stream]: They had played over to splash! They got out of the jar. Tom they are really chuncog the dealichy practiced that she shock his family, he's parint the feel better. The eld barked jam. It was best addde

📜 [Thought Stream]: Timmy said, "Thank you, Mommy. I can have from calling the drees and yummy with your tail. The sound asked it if you - and a pretty slide to go for Sweepbarklesss. The End. And the floor walk in the la

📜 [Thought Stream]: The noises started to play. They played together in their train. They are angry." The sad. Lily was a snacks and lady quite. Sally lay and weere trucks to the party. She was full and her

H2Q

Hamiltonian

MicroStream

15 replies • 2026-01-27 16:32:47 +08:00

YanSeven

Dec 20, 2025 via Android

What is this supposed to mean?

evegod

Dec 20, 2025

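@YanSeven Hello, I'm promoting my experimental architecture model. The model is fully open source, and so is the core training architecture code. I'm also hoping that people with some spare time can help me validate it with double-blind experiments and point out my mistakes, but I'd ask that you actually run the code and check the results before criticizing. The code above was trained locally on a 4070 Ti SUPER; it doesn't take much compute, and the file set is small.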

ty29022

Dec 20, 2025

Go find a decent hospital and get yourself checked out.

evegod

Dec 20, 2025

@ty29022 Sure, recommend me a good hospital!

Nasdaq

PRO

Dec 20, 2025

"You know, these high-speed machines have been brought into V2EX. Remember the principle I told you about before?"

By the way: is OP serious?

evegod

Dec 20, 2025

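@liu731 Just kidding. It's only an experimental model. I find it interesting that it converges, and that training directly on characters with no vocabulary layer led to near-standard expressions emerging on their own. I hope anyone interested will try to reproduce it and also help find mistakes in the code. Every time I check it myself it looks right to me, so I'm sharing it with everyone. If you have the interest and the spare time, consider it helping a brother out.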

evegod

Dec 20, 2025

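@liu731 The mathematical structures in it are actually implemented; you can review the code structure and analyze its mathematical framework. This is also my "Gemini-oriented programming" approach: most of the code was generated, or rather implemented entirely by describing the architectural requirements to Gemini in natural language, after which I analyzed and evaluated whether each method was implemented as required. I also ran everything above in an offline Windows environment on a 4070 Ti SUPER to produce the log files. That's why I say it's an interesting experimental model with interesting generation results. The whole experiment, including getting these results, took just four evenings, though each one ran until 4 a.m. Night is the only time I can think quietly; during the day I still have a life to live...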

nickyadance23

Dec 22, 2025

Quantum programming + ICU-grade dashboard

evegod

Dec 22, 2025

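@nickyadance23 Just put it down to my odd sense of humor. Most of the code can in fact be generated directly by Gemini. The main point is that once the architecture ran, it could produce correct words and semantics with no vocabulary layer, which I find interesting. That is also a target the architecture predicted might be achievable, so I'm sharing it with everyone. The repository includes a detailed discussion of why the mathematical architecture is designed this way.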

coefu

Dec 22, 2025

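@ty29022 Haha, at first glance I also thought this was a crackpot amateur scientist, but really the problem is just that he doesn't express himself well. His research follows a non-mainstream route, probably because his background leans toward physics and mathematics, but the research process and logic are solid science, and the direction is promising. His approach is to build a framework from physical dynamics and mathematics, as distinct from the statistical framework of the classic Transformer. He has a deep understanding of Transformers, and judging from the results he has produced recently, the path is viable; it's just still at an early stage with many bugs to fix. Most researchers with a CS background couldn't do this: limited by their background, they lack this kind of expertise and wouldn't even come up with the idea.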

coefu

Dec 22, 2025

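@ty29022 In early neural network research, the Boltzmann machine (Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). A Learning Algorithm for Boltzmann Machines. Cognitive Science.) followed this same line of thinking. A Boltzmann machine is a probabilistic model that treats "learning" as "energy minimization + thermal-equilibrium sampling". This is also why the 2024 Nobel Prize in Physics went to Hinton. His H2Q and the Boltzmann machine are, at the conceptual level, different implementations of the "same worldview".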

evegod

Dec 22, 2025

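@coefu Thank you, my friend. My way of expressing things does tend to come out a bit odd. I'm currently trying to train on larger datasets, and trying, still without a vocabulary layer, to use a teacher model so that this architecture can stably reach logical alignment, in a way that works a bit like a teacher giving lessons. Many of the hyperparameter choices for this prototype run counter to intuition: higher precision actually brings the loss down faster, and the computational cost is not large. I'm still analyzing why. It may be an unusual property where phase differences in the wave function cancel out, so the computation is dense but the result is sparse, a bit like taking a reciprocal. When I have something new I'll share it with everyone. Right now I'm trying to stabilize a prototype model to the point where it is usable, ideally one that forms a self-referential sense of identity and can keep learning and keep becoming more logical. I'm still experimenting, and I do feel there is something worth doing here. I opened fresh chats and had Gemini evaluate the project, and the academic assessment was quite good. The log files in the repo are real; when you have time you can tweak things and run it locally. I'm gradually realizing that the core architecture itself isn't compute-hungry; the cost all goes into loading the threads for the sub-computations. It's giving me a headache...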

coefu

Dec 22, 2025

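@evegod #12 If you mean guided scoring, the idea resembles the actor-critic approach in RL; if you mean teacher-led instruction, you could try the knowledge distillation route. I don't know whether you're familiar with these, but I think it might be better than working everything out from scratch yourself?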

israinbow

Jan 25 via Android

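Hi OP, a question from a complete outsider: is training this kind of model constrained by scaling laws? Or, put differently, what kind of dataset counts as high-quality data? My startup team is validating how data affects models and algorithms, and our current focus is obtaining higher-quality data. Would it be convenient for you, when the time is right, to hold a technical exchange session? We'd like to hear more expert opinions :)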

evegod

Jan 27

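@israinbow This is a personal project for me too, driven by my interest in these things; feel free to look into it. Because I need to attempt some automated theorem provers for mathematics and physics, I'm building my own AGI foundation model based on what I believe matches my view of the structure of human cognition. It has no near-term engineering or commercial value. There is a lot left to do, and keeping the automated programming from cheating also takes a great deal of analysis and ad hoc decision-making, so I'm not sure when there will be a usable version. You can take a look at my latest project, which is also fully open source under the MIT license. https://github.com/makai891124-prog/H2Q-Evo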
