From d4f99d5be4a9329568911270bed0e64707c1ab7d Mon Sep 17 00:00:00 2001
From: Ralph Amissah
Date: Tue, 12 May 2026 23:02:54 -0400
Subject: grammar: accept original SiSU bespoke @key: header dialect

Until now the grammar handled only the sisudoc-spine YAML header form.
Real SiSU markup uses two textually disjoint header dialects with an
identical body grammar:

  bespoke (original SiSU, ruby):    yaml (sisudoc-spine):
    % SiSU 4.0.0                      # SiSU 8.0
    @title: Alice's Adventures        title: "Alice's Adventures"
    @creator:                         creator:
      :author: Carroll, Lewis           author: "Carroll, Lewis"

Add bespoke-dialect rules alongside the existing yaml ones; the body
grammar is shared between the two.

grammar.js:
- version_comment widened from /# SiSU(spine)? / to /[#%] SiSU[^\n]*\n/.
  Real banners observed across both corpora include "# SiSU 8.0",
  "# SiSUspine 8.0", "# SiSU master 8.0", "# SiSU: http://...",
  "% SiSU 4.0.0", "% SiSU 0.72", "% SiSU http://...",
  "% SiSU markup for 0.16 and later".
- Added sisu_header_field / sisu_header_key / sisu_header_value /
  sisu_header_continuation.
- sisu_header_key whitelists the 14 @keys observed in the sisu corpus
  (including the @links:+ additive variant), parallel to the existing
  yaml header_key whitelist.
- sisu_header_continuation accepts any 1+ space indented line whose
  first non-space character is not a newline. Covers " :sub: val",
  " { text }url" freeform under @links:, and 3+ space wrap-line
  continuations (10690 occurrences across the sisu corpus, almost all
  inside @classify: :topic_register: entries).
- Wired sisu_header_field into _toplevel alongside header_field.

queries/highlights.scm:
- Added @keyword / @string captures for the new sisu_* nodes, parallel
  to the existing yaml header captures.

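The version_comment change can be checked outside the grammar. The following is a minimal sketch, not code from this commit: it copies the two patterns quoted above into plain JavaScript regular expressions and runs them over the banner strings listed in the message. The trailing "\n" on each banner and the variable names are assumptions made for illustration, and tree-sitter matches tokens at the current lexer position, whereas `RegExp.test` searches anywhere in the string.

```javascript
// Patterns as quoted in the commit message (old and widened version_comment).
const oldVersionComment = /# SiSU(spine)? /;
const newVersionComment = /[#%] SiSU[^\n]*\n/;

// Banner lines listed in the commit message; the "..." is kept as written
// there, and the trailing newline is added because the new pattern needs it.
const banners = [
  "# SiSU 8.0\n",
  "# SiSUspine 8.0\n",
  "# SiSU master 8.0\n",
  "# SiSU: http://...\n",
  "% SiSU 4.0.0\n",
  "% SiSU 0.72\n",
  "% SiSU http://...\n",
  "% SiSU markup for 0.16 and later\n",
];

// Every listed banner should be accepted by the widened pattern.
const allMatch = banners.every((line) => newVersionComment.test(line));

// Banners the old pattern would not have accepted.
const missedByOld = banners.filter((line) => !oldVersionComment.test(line));

// A bespoke header line is not a banner and should not match.
const rejectsHeaderLine = !newVersionComment.test("@title: Alice's Adventures\n");

console.log(allMatch, missedByOld.length, rejectsHeaderLine); // prints: true 5 true
```

The five banners missed by the old pattern are the four "% SiSU" forms and "# SiSU: http://...", which is consistent with the widening described above.
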
test/corpus/headers_sisu.txt:
- 12 new cases: % SiSU banner variants, @title: with inline value,
  @creator: + :author:, @date: with multiple sub-keys, @make: mixed
  sub-keys, @links: with freeform { text }url continuations, @links:+
  additive, full bespoke header block, and a coexistence case
  confirming yaml + bespoke at the same top level.

README.md and sisu-markup_tree-sitter.md:
- Describe dual-dialect support; add sisu corpus results table.

Test results:
- tree-sitter test: 79 / 79 pass.
- sisu-markup-samples/data/samples/ (full sisu corpus): 44 / 65 parse
  cleanly (was 0 / 65). current/ layout parses at 20 / 21 (95 %); the
  dominant failure mode is the wrapped/ layout (7 / 21) which trips the
  pre-existing one-line-per-paragraph limitation, not the new header
  rules.
- sisudoc-spine-samples/markup/ (full spine corpus): 37 / 46 unchanged.
  No regression in the yaml dialect.

Mixing the two dialects inside one document remains parseable but
non-idiomatic; enforcement is left to a future linter pass rather than
the grammar.

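The continuation cases covered by the new corpus tests can be illustrated with a plain regular expression. This is a rough approximation of the rule as worded in the message ("1+ space indented line whose first non-space character is not a newline"), not the pattern used in grammar.js, which is not shown in this patch. The `continuation` name and the sample lines are assumptions made for illustration.

```javascript
// Approximation of the sisu_header_continuation rule as described in prose:
// one or more leading spaces, then a first character that is neither a
// space nor a newline, then the rest of the line.
const continuation = /^ +[^ \n][^\n]*$/;

// Sample header lines built from the examples in the commit message.
const lines = [
  "@links:",                      // key line, not indented
  " { text }url",                 // freeform continuation under @links:
  "@creator:",                    // key line, not indented
  " :author: Carroll, Lewis",     // sub-key continuation
  "   ",                          // spaces only: nothing after the indent
  "@title: Alice's Adventures",   // key with inline value
];

// true where a line would count as a continuation under this approximation.
const flags = lines.map((line) => continuation.test(line));

console.log(flags); // [ false, true, false, true, false, false ]
```

Under this reading, only indented lines with visible content attach to the preceding @key: field; whitespace-only lines and unindented lines do not.
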
(assisted by Claude-Code)
---
 src/node-types.json | 62 +++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 58 insertions(+), 4 deletions(-)

(limited to 'src/node-types.json')

diff --git a/src/node-types.json b/src/node-types.json
index 23a8908..2d9c210 100644
--- a/src/node-types.json
+++ b/src/node-types.json
@@ -433,6 +433,10 @@
           "type": "quote_block_tic",
           "named": true
         },
+        {
+          "type": "sisu_header_field",
+          "named": true
+        },
         {
           "type": "table_block_curly",
           "named": true
@@ -788,6 +792,11 @@
       ]
     }
   },
+  {
+    "type": "header_value",
+    "named": true,
+    "fields": {}
+  },
   {
     "type": "heading",
     "named": true,
@@ -1442,6 +1451,47 @@
       }
     }
   },
+  {
+    "type": "sisu_header_field",
+    "named": true,
+    "fields": {
+      "key": {
+        "multiple": false,
+        "required": true,
+        "types": [
+          {
+            "type": "sisu_header_key",
+            "named": true
+          }
+        ]
+      },
+      "value": {
+        "multiple": false,
+        "required": false,
+        "types": [
+          {
+            "type": "sisu_header_value",
+            "named": true
+          }
+        ]
+      }
+    },
+    "children": {
+      "multiple": true,
+      "required": false,
+      "types": [
+        {
+          "type": "sisu_header_continuation",
+          "named": true
+        }
+      ]
+    }
+  },
+  {
+    "type": "sisu_header_value",
+    "named": true,
+    "fields": {}
+  },
   {
     "type": "strikethrough",
     "named": true,
@@ -1966,10 +2016,6 @@
     "type": "header_key",
     "named": true
   },
-  {
-    "type": "header_value",
-    "named": true
-  },
   {
     "type": "heading_content",
     "named": true
@@ -2030,6 +2076,14 @@
     "type": "segment_name",
     "named": true
   },
+  {
+    "type": "sisu_header_continuation",
+    "named": true
+  },
+  {
+    "type": "sisu_header_key",
+    "named": true
+  },
   {
     "type": "suppress_marker",
     "named": true
-- 
cgit v1.2.3