🎉 initiate project *astro_rewrite*

2023-07-19 21:31:30 +02:00 · 2023-07-19 21:31:30 +02:00 · 2ba37bfbe3
commit 2ba37bfbe3
parent ffd4d5e86c
8658 changed files with 2268794 additions and 2538 deletions
--- a/node_modules/parse-latin/readme.md
+++ b/node_modules/parse-latin/readme.md
@ -0,0 +1,150 @@
+# parse-latin
+
+[![Build][build-badge]][build]
+[![Coverage][coverage-badge]][coverage]
+[![Downloads][downloads-badge]][downloads]
+[![Size][size-badge]][size]
+[![Chat][chat-badge]][chat]
+
+A Latin-script language parser for [**retext**][retext] producing **[nlcst][]**
+nodes.
+
+Whether Old-English (“þā gewearþ þǣm hlāforde and þǣm hȳrigmannum wiþ ānum
+penninge”), Icelandic (“Hvað er að frétta”), or French (“Où sont les toilettes?”),
+`parse-latin` does a good job at tokenizing it.
+
+`parse-latin` also does a decent job at tokenizing Latin-like scripts, such as
+Cyrillic (“Добро пожаловать!”), Georgian (“როგორა ხარ?”), and Armenian (“Շատ հաճելի
+է”).
+
+## Install
+
+This package is ESM only: Node 12+ is needed to use it and it must be `import`ed
+instead of `require`d.
+
+[npm][]:
+
+```sh
+npm install parse-latin
+```
+
+## Use
+
+```js
+import {inspect} from 'unist-util-inspect'
+import {ParseLatin} from 'parse-latin'
+
+const tree = new ParseLatin().parse('A simple sentence.')
+
+console.log(inspect(tree))
+```
+
+Running this yields:
+
+```txt
+RootNode[1] (1:1-1:19, 0-18)
+└─0 ParagraphNode[1] (1:1-1:19, 0-18)
+    └─0 SentenceNode[6] (1:1-1:19, 0-18)
+        ├─0 WordNode[1] (1:1-1:2, 0-1)
+        │   └─0 TextNode "A" (1:1-1:2, 0-1)
+        ├─1 WhiteSpaceNode " " (1:2-1:3, 1-2)
+        ├─2 WordNode[1] (1:3-1:9, 2-8)
+        │   └─0 TextNode "simple" (1:3-1:9, 2-8)
+        ├─3 WhiteSpaceNode " " (1:9-1:10, 8-9)
+        ├─4 WordNode[1] (1:10-1:18, 9-17)
+        │   └─0 TextNode "sentence" (1:10-1:18, 9-17)
+        └─5 PunctuationNode "." (1:18-1:19, 17-18)
+```
+
+## API
+
+This package exports the following identifiers: `ParseLatin`.
+There is no default export.
+
+### `ParseLatin(value)`
+
+Exposes the functionality needed to tokenize natural Latin-script languages into
+a syntax tree.
+If `value` is passed here, it does not need to be passed to `#parse()` again.
+
+#### `ParseLatin#tokenize(value)`
+
+Tokenize `value` (`string`) into letters and numbers (words), white space, and
+everything else (punctuation).
+The returned nodes are a flat list without paragraphs or sentences.
+
+###### Returns
+
+[`Array.<Node>`][nlcst] — Nodes.
+
+#### `ParseLatin#parse(value)`
+
+Tokenize `value` (`string`) into an [NLCST][] tree.
+The returned node is a `RootNode` containing paragraphs and sentences.
+
+###### Returns
+
+[`Node`][nlcst] — Root node.
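
As a rough illustration of working with such a tree, the sketch below walks a hand-written object that mirrors the shape of the `inspect` output shown under "Use" (positions omitted). The tree literal and the helpers `toText` and `collect` are hypothetical names introduced for this example; they are not part of `parse-latin`.

```javascript
// Hand-written tree mirroring the "Use" output; NOT produced by the library.
const tree = {
  type: 'RootNode',
  children: [
    {
      type: 'ParagraphNode',
      children: [
        {
          type: 'SentenceNode',
          children: [
            {type: 'WordNode', children: [{type: 'TextNode', value: 'A'}]},
            {type: 'WhiteSpaceNode', value: ' '},
            {type: 'WordNode', children: [{type: 'TextNode', value: 'simple'}]},
            {type: 'WhiteSpaceNode', value: ' '},
            {type: 'WordNode', children: [{type: 'TextNode', value: 'sentence'}]},
            {type: 'PunctuationNode', value: '.'}
          ]
        }
      ]
    }
  ]
}

// Hypothetical helper: concatenate the values of all leaves under `node`.
function toText(node) {
  return 'children' in node ? node.children.map(toText).join('') : node.value
}

// Hypothetical helper: collect every node of the given type, depth first.
function collect(node, type, found = []) {
  if (node.type === type) found.push(node)
  for (const child of node.children || []) collect(child, type, found)
  return found
}

const words = collect(tree, 'WordNode').map(toText)
console.log(words)
console.log(toText(tree))
```

Parent nodes carry `children` and leaves carry `value`, so a short recursive walk is enough to serialize the tree or pick out nodes by type.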
+
+## Algorithm
+
+> Note: The easiest way to see **how parse-latin tokenizes and parses** is to
+> use the [online parser demo][demo], which
+> shows the syntax tree corresponding to the typed text.
+
+`parse-latin` splits text into white space, word, and punctuation tokens.
+`parse-latin` starts out with a pretty easy definition, one that most other
+tokenizers use:
+
+*   A “word” is one or more letter or number characters
+*   A “white space” is one or more white space characters
+*   A “punctuation” is one or more of anything else
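
The three definitions above can be sketched in a few lines. This is a minimal illustration, not the code `parse-latin` actually runs: the names `classify` and `naiveTokenize`, the token shape `{type, value}`, and the choice of Unicode letter and number properties for "word" characters are all assumptions made for this example.

```javascript
// Hypothetical helper (not part of parse-latin): classify one character.
// Assumption: "letter or number" is approximated with Unicode \p{L} and \p{N}.
function classify(character) {
  if (/[\p{L}\p{N}]/u.test(character)) return 'word'
  if (/\s/u.test(character)) return 'whiteSpace'
  return 'punctuation'
}

// Group consecutive characters of the same class into one flat token.
function naiveTokenize(value) {
  const tokens = []
  for (const character of value) {
    const type = classify(character)
    const previous = tokens[tokens.length - 1]
    if (previous && previous.type === type) {
      previous.value += character
    } else {
      tokens.push({type, value: character})
    }
  }
  return tokens
}

console.log(naiveTokenize('A simple sentence.'))
console.log(naiveTokenize('non-profit'))
```

With these naive rules, `non-profit` comes out as three tokens (word, punctuation, word), which is why a merging step is needed afterwards.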
+
+Then, it manipulates and merges those tokens into an [nlcst][] syntax tree,
+adding sentences and paragraphs where needed.
+In doing so, it accounts for exceptions such as:
+
+*   Some punctuation marks are part of the word they occur in, such as
+    `non-profit`, `she’s`, `G.I.`, `11:00`, `N/A`, `&c`, `nineteenth- and…`
+*   Some full-stops do not mark a sentence end, such as `1.`, `e.g.`, `id.`
+*   Although full-stops, question marks, and exclamation marks (sometimes) end a
+    sentence, that end might not occur directly after the mark, such as `.)`,
+    `."`
+*   And many more exceptions
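
To make the first kind of exception concrete, the sketch below merges a punctuation mark into a word when it sits directly between two word tokens. It is an assumption-laden illustration only: `innerWordMarks`, `tokenize`, and `mergeInnerWordPunctuation` are hypothetical names, the set of marks is arbitrary, and cases such as `G.I.` or sentence-final `.)` are deliberately not handled.

```javascript
// Hypothetical set of marks treated as word-internal; illustrative only.
const innerWordMarks = new Set(['-', '’', "'", '/', ':'])

// Naive flat tokenizer: runs of letters/numbers, white space, or anything else.
function tokenize(value) {
  const parts = value.match(/[\p{L}\p{N}]+|\s+|[^\p{L}\p{N}\s]+/gu) || []
  return parts.map((part) => {
    let type = 'punctuation'
    if (/^[\p{L}\p{N}]/u.test(part)) type = 'word'
    else if (/^\s/u.test(part)) type = 'whiteSpace'
    return {type, value: part}
  })
}

// Merge `word, mark, word` into a single word token; input is not mutated.
function mergeInnerWordPunctuation(tokens) {
  const result = []
  let index = 0
  while (index < tokens.length) {
    const token = tokens[index]
    const previous = result[result.length - 1]
    const next = tokens[index + 1]
    if (
      token.type === 'punctuation' &&
      innerWordMarks.has(token.value) &&
      previous && previous.type === 'word' &&
      next && next.type === 'word'
    ) {
      // Fold the mark and the following word into the previous word.
      previous.value += token.value + next.value
      index += 2
    } else {
      result.push({...token})
      index++
    }
  }
  return result
}

const merged = mergeInnerWordPunctuation(tokenize('she’s at 11:00'))
console.log(merged.map((token) => token.value))
```

A full-stop followed by white space, as in `Done. Next`, is left alone here, because the mark is neither in the set nor directly followed by a word.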
+
+## License
+
+[MIT][license] © [Titus Wormer][author]
+
+<!-- Definitions -->
+
+[build-badge]: https://github.com/wooorm/parse-latin/workflows/main/badge.svg
+
+[build]: https://github.com/wooorm/parse-latin/actions
+
+[coverage-badge]: https://img.shields.io/codecov/c/github/wooorm/parse-latin.svg
+
+[coverage]: https://codecov.io/github/wooorm/parse-latin
+
+[downloads-badge]: https://img.shields.io/npm/dm/parse-latin.svg
+
+[downloads]: https://www.npmjs.com/package/parse-latin
+
+[size-badge]: https://img.shields.io/bundlephobia/minzip/parse-latin.svg
+
+[size]: https://bundlephobia.com/result?p=parse-latin
+
+[chat-badge]: https://img.shields.io/badge/join%20the%20community-on%20spectrum-7b16ff.svg
+
+[chat]: https://spectrum.chat/unified/retext
+
+[npm]: https://docs.npmjs.com/cli/install
+
+[demo]: https://wooorm.com/parse-latin/
+
+[license]: license
+
+[author]: https://wooorm.com
+
+[retext]: https://github.com/retextjs/retext
+
+[nlcst]: https://github.com/syntax-tree/nlcst