Tech

Syntax & Circuits · Code and experiments bridging syntax/semantics and computation

Projects

Small, well-documented projects that I actually use.

nlp-toys

Data cleaning, annotation scripts, and reusable utilities

A collection of practical NLP utilities for text preprocessing, annotation, and analysis. Built with Python and designed for research workflows.

# Example usage
python nlp_toys/text_cleaner.py --input data.txt --output clean.txt
python nlp_toys/annotation_helper.py --corpus corpus.json --scheme BIO
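To make the BIO scheme concrete, here is a minimal sketch of how token-level spans map to BIO tags. It is an illustration only, not the code inside annotation_helper.py: the function name `spans_to_bio` and the `(start, end, label)` span format (token indices, end exclusive) are assumptions made for this example.

```python
def spans_to_bio(tokens, spans):
    """Convert labeled token spans into one BIO tag per token.

    `spans` is a list of (start, end, label) tuples over token indices,
    with `end` exclusive. This format is assumed for illustration.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


tokens = ["Noam", "Chomsky", "taught", "at", "MIT"]
spans = [(0, 2, "PER"), (4, 5, "ORG")]
for token, tag in zip(tokens, spans_to_bio(tokens, spans)):
    print(f"{token}\t{tag}")
```

Rejecting overlapping spans keeps the output a flat tagging; nested entities would need a different scheme.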

langlab-notebooks

Corpus processing and visualization

Jupyter notebooks for corpus analysis, linguistic visualization, and experimental data processing. Interactive tools for exploring language patterns.
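As a rough idea of the kind of first-pass corpus statistics such a notebook might start from, the sketch below computes token counts, type counts, and a type-token ratio. It is not taken from the notebooks themselves; `tokenize` and `frequency_profile` are hypothetical names, and the regex tokenizer is a deliberately crude stand-in for a real one.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on word characters (a crude, assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def frequency_profile(texts, top_n=5):
    """Summarize a small corpus: token count, type count, TTR, top words."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    total = sum(counts.values())
    return {
        "tokens": total,
        "types": len(counts),
        "ttr": len(counts) / total if total else 0.0,  # type-token ratio
        "top": counts.most_common(top_n),
    }


corpus = ["The cat sat.", "The cat ran."]
print(frequency_profile(corpus, top_n=2))
```

The type-token ratio is sensitive to corpus size, so it is only comparable across samples of similar length.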

learning-tools

Backend/frontend prototype for language-learning check-ins

Language learning tracking system with streak visualization, vocabulary management, and progress analytics. Built with Node.js and React.
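The streak logic behind a check-in tracker can be sketched in a few lines. The project itself is built with Node.js and React; the version below is written in Python purely for illustration, and the function name `current_streak` and the rule that a streak stays alive until the end of the following day are assumptions, not the project's actual behavior.

```python
from datetime import date, timedelta


def current_streak(checkin_dates, today):
    """Count consecutive check-in days ending today (or yesterday).

    Assumption: a missing check-in for `today` does not break the streak
    yet, because the day is not over.
    """
    days = set(checkin_dates)
    cursor = today if today in days else today - timedelta(days=1)
    streak = 0
    while cursor in days:
        streak += 1
        cursor -= timedelta(days=1)
    return streak


checkins = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3), date(2024, 1, 5)]
print(current_streak(checkins, today=date(2024, 1, 3)))
```

Passing `today` explicitly instead of calling `date.today()` inside the function keeps the logic easy to test.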

Python · JavaScript · Node.js · React · Jupyter · spaCy · NLTK · TensorFlow · PyTorch · Docker · Git · REST APIs

Tools & Libraries

Open-source contributions and reusable components for computational linguistics research.

Corpus Analysis Toolkit

Python library for large-scale corpus processing with support for multiple languages and annotation schemes.

Linguistic Visualization Suite

Interactive visualization tools for syntax trees, semantic networks, and linguistic feature distributions.

Experiments


Semantic Change Detection

Vector space models for detecting diachronic semantic drift in large-scale corpora. Tools: Python, gensim, spaCy.
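One simple way to illustrate the idea is with count-based vectors: build a co-occurrence vector for a target word in each time slice and measure the cosine distance between them. The sketch below uses only the standard library rather than gensim or spaCy, and it is not the experiment's actual method; the function names, the window size, and the toy sentences are all assumptions for illustration.

```python
import math
from collections import Counter


def cooccurrence_vector(sentences, target, window=2):
    """Count context words within `window` tokens of each `target` occurrence."""
    counts = Counter()
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse count vectors (1.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def semantic_drift(old_sentences, new_sentences, target, window=2):
    """Drift score for `target` between two time slices (0 = same contexts)."""
    return cosine_distance(
        cooccurrence_vector(old_sentences, target, window),
        cooccurrence_vector(new_sentences, target, window),
    )


# Toy, made-up time slices (pre-tokenized).
old_period = [["the", "mouse", "ate", "cheese"], ["a", "mouse", "ran", "away"]]
new_period = [["click", "the", "mouse", "button"], ["the", "mouse", "ate", "cheese"]]
print(round(semantic_drift(old_period, new_period, "mouse"), 3))
```

Because both vectors are indexed by the same context words, they are directly comparable across periods. Dense embeddings trained separately per period would first need to be aligned to a common space.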

Agreement Attraction Analysis

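Computational models for studying agreement attraction in natural language processing: cases where a verb agrees with a nearby "attractor" noun rather than its grammatical subject, as in "The key to the cabinets are rusty."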