The content of this document was copied from a Feishu document and is kept here so that it can be searched. The formatting may be off in places, so consulting the original Feishu document is recommended.
Background#
Recently I have been following Robinson, written by mbrubeck, to learn how to build a simple browser engine in Rust (I will write it up in a separate document once it is finished). Along the way, formats such as HTML and CSS have to be parsed, which means writing the corresponding parsers.
Writing a parser by hand from scratch is tedious and error-prone. Beyond the rules of the format or protocol being parsed, you also have to think about error handling, extensibility, and parsing performance. For that reason, the article itself recommends using an existing third-party library such as pest instead.
Looking back at my day-to-day development work, I rarely need to build a parser. When I have to parse a format or protocol such as JSON or CSV, I usually reach straight for a third-party library dedicated to it. In practice, though, not every protocol or format has a ready-made parser, particularly the many network communication protocols, and existing parsers are hard to customize. So I am taking this opportunity to understand parsers and learn how to build them, which should make similar situations easier in the future.
Notes:
- This document does not dig into the underlying principles of parsers or parser libraries. It focuses on an introductory understanding of parsers and on using them in practice.
- This series is split into two parts: the first covers understanding parsers and using third-party libraries, and the second covers hands-on practice.
- The source code used in this series is available at https://github.com/catwithtudou/parser_toy.
Prerequisites#
This section introduces some background on parsers that makes the rest of the article easier to follow.
Parser#
The term parser is used here in a broad sense: a component that converts information in some format into a particular data structure.
It "decodes" information in a given format and abstracts it into organized data structures, which makes the information easier to understand and process.
For example: given the text of an arithmetic expression, "1 + 2", we want a program to compute the result.
For the program to recognize the expression, a parser for arithmetic expressions first converts it into a (left, op, right) structure, which is then evaluated.
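As a minimal sketch of that idea, the block below turns the text "1 + 2" into a (left, op, right) structure and then evaluates it. The names `Expr`, `parse_expr`, and `eval` are hypothetical and chosen only for illustration, and the sketch assumes the three parts are separated by whitespace; it is not how pest or nom work internally.

```rust
/// Parsed form of a binary arithmetic expression such as "1 + 2".
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, PartialEq)]
pub struct Expr {
    pub left: i64,
    pub op: char,
    pub right: i64,
}

/// Parse "<int> <op> <int>", where the parts are separated by whitespace.
/// Returns None when the text does not have exactly that shape.
pub fn parse_expr(text: &str) -> Option<Expr> {
    let mut parts = text.split_whitespace();
    let left = parts.next()?.parse::<i64>().ok()?;
    let op_text = parts.next()?;
    let right = parts.next()?.parse::<i64>().ok()?;
    if parts.next().is_some() {
        return None; // trailing tokens are not part of this tiny grammar
    }
    // The operator must be exactly one of the four supported characters.
    let mut chars = op_text.chars();
    let op = chars.next()?;
    if chars.next().is_some() || !"+-*/".contains(op) {
        return None;
    }
    Some(Expr { left, op, right })
}

/// Evaluate the parsed structure; None on overflow or division by zero.
pub fn eval(expr: &Expr) -> Option<i64> {
    match expr.op {
        '+' => expr.left.checked_add(expr.right),
        '-' => expr.left.checked_sub(expr.right),
        '*' => expr.left.checked_mul(expr.right),
        '/' => expr.left.checked_div(expr.right),
        _ => None,
    }
}
```

Separating `parse_expr` from `eval` mirrors the split described above: the parser only produces the structure, and the computation works on that structure rather than on raw text.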
In computing, parsers are indispensable for data processing and show up in many scenarios. Some common examples:
- The parser in a compiler or interpreter analyzes the syntax of source code and extracts an abstract syntax tree (AST).
- JSON text, the data interchange format widely used in web applications, is deserialized into the required data structures by a corresponding parser.
- Parsers are also used for network communication protocols, scripting languages, database languages, and so on.
PEG#
Before introducing PEG (Parsing Expression Grammar), it helps to start from something more familiar: regular expressions.
Regular expressions and PEG are related in that both use a dedicated syntax to match and parse text. They differ as follows:
- [Syntax] Regular expressions use a dedicated syntax to describe string patterns and are usually used for simple string matching and searching. PEG uses a syntax that can describe more complex grammars and is usually used for more demanding language parsing and analysis.
- [Use cases] Regular expressions are mainly used for simple text processing, such as searching for text that fits a pattern or validating the format of input. PEG is mainly used for processing complex language structures, such as parsing programming languages or building interpreters.
With that comparison, you should have a rough idea of what PEG is.
PEG is introduced here because tools built on it, called parser generators, let you implement a custom parser.
Now for a more formal introduction to PEG:
- Overview of PEG
Parsing Expression Grammar, abbreviated PEG:
- An analytic formal grammar, proposed by Bryan Ford in 2004 and related to the family of top-down parsing languages introduced in the 1970s.
- As a syntax for describing language structure, it can handle more complex structures than regular expressions, and because it is recursive it can describe arbitrarily deep nesting.
- It defines grammar rules in a simple and flexible way, and those rules can be used to parse an input string and produce a syntax tree.
- It is easy to use, precise, and fast, and implementations offer features such as error reporting and reusable rule templates, so it is widely used for parsing and analyzing text.
- Applying PEG
PEG syntax resembles a programming language: it uses operators and rules to describe language structure.
- Operators include "|" (choice), "&" (and-predicate), "?" (optional), and others. Rules describe the concrete structure of the language.
- For example, the following simple PEG rule describes the syntax of an integer:
int := [0-9]+
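Because a PEG can be translated directly into efficient parser code, many parser tools are built on it, for example PEG.js. (ANTLR is often mentioned alongside them; it is also a parser generator, but it is based on LL(*) parsing rather than PEG.)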
Parser combinators#
With the above understanding of parsers, parser combinators are relatively easy to grasp.
- Definition and idea of parser combinators
Put simply, a parser combinator builds a parser by combining other parser components.
The idea fits software engineering very well. It is a technique for building parsers through function composition: small, reusable, testable parser components are combined into complex parsers. This makes parser construction more flexible and extensible, greatly improves development efficiency, and makes later maintenance easier.
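The composition idea can be sketched with plain functions from the standard library alone. Everything below (`Parsed`, `digits`, `plus`, `pair`, `sum`) is a hypothetical illustration under the assumption that a parser is "a function from input to an optional (rest, value) pair"; it is not the API of any real combinator library.

```rust
/// Every parser here has the same shape: it takes the remaining input and,
/// on success, returns (input left over, parsed value).
pub type Parsed<'a, T> = Option<(&'a str, T)>;

/// Small component: one or more ASCII digits, parsed as a u32.
pub fn digits(input: &str) -> Parsed<'_, u32> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return None; // no leading digit
    }
    let value = input[..end].parse::<u32>().ok()?;
    Some((&input[end..], value))
}

/// Small component: a literal '+' sign.
pub fn plus(input: &str) -> Parsed<'_, char> {
    input.strip_prefix('+').map(|rest| (rest, '+'))
}

/// Combinator: run `first`, then run `second` on whatever `first` left over.
pub fn pair<'a, A, B>(
    first: impl Fn(&'a str) -> Parsed<'a, A>,
    second: impl Fn(&'a str) -> Parsed<'a, B>,
    input: &'a str,
) -> Parsed<'a, (A, B)> {
    let (rest, a) = first(input)?;
    let (rest, b) = second(rest)?;
    Some((rest, (a, b)))
}

/// A larger parser assembled from the pieces above: "<digits>+<digits>".
pub fn sum(input: &str) -> Parsed<'_, u32> {
    let (rest, (left, _plus)) = pair(digits, plus, input)?;
    let (rest, right) = digits(rest)?;
    Some((rest, left.checked_add(right)?))
}
```

Each piece can be tested in isolation (`digits("42 boxes")` leaves `" boxes"`), and `sum` reuses the same pieces without knowing how they are implemented, which is the property the paragraph above describes.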
- Parser combinators vs. parser generators
Parser combinators sit at the same level as the parser generators mentioned above: both are ways of building a parser. An analogy:
- Think of the parser you want to build (for example a JSON parser) as a large building.
- With a parser generator, you construct each building almost from scratch, and the parts that buildings have in common (doors and windows, say) cannot be reused.
- With parser combinators, you build small, reusable, testable components, like Lego bricks, and assemble the building from them. When you construct a new building, you can reuse components you made earlier, which is very convenient. When something goes wrong in the parser, it is also easy to pin down the component responsible, which helps later maintenance.
- Parser combinators vs. PEG-based parser generators
[Expressiveness] Parser combinators are more flexible: parsers are defined and combined directly with the features of the host programming language. A PEG-based parser generator describes the parser with a dedicated grammar syntax, so its expressiveness is limited by that syntax, and you have to learn the PEG syntax in addition to the generator's interface.
[Performance] How parser combinators and parser generators compare depends on the concrete implementation and use case. In principle, a parser generator usually emits efficient parser code, so it may perform better on large grammars or complex input, while parser combinators usually compose parsers dynamically at run time, which adds some overhead.
In Rust today, however, nom (built on parser combinators) generally performs better than pest (built on PEG).
Rust parser libraries#
This section introduces two classic third-party libraries for implementing parsers in Rust: the PEG-based pest and the parser-combinator library nom.
pest#
Overview#
pest is a general-purpose parser written in Rust with a focus on accessibility, correctness, and performance. It takes a PEG, as described above, as its input, which gives it the extra expressive power needed to parse complex languages, while still letting you define and generate a custom parser in a simple, elegant way.
It also offers automatically generated error reports, automatic generation of the parser trait implementation through a derive attribute, and the ability to define multiple parsers in a single file.
Usage example#
- Add the pest dependencies to Cargo.toml
[dependencies]
pest = "2.6"
pest_derive = "2.6"
- Create a new file src/grammar.pest and write the parsing expression grammar
The grammar here defines the rule for parsing a field: each character is an ASCII digit, a decimal point, or a minus sign, and + means the pattern occurs one or more times.
field = { (ASCII_DIGIT | "." | "-")+ }
- Create a new file src/parser.rs and define the parser
The following code defines a struct named DemoParser and uses the derive macro to implement, on every compile, a parser that follows the patterns in the grammar file.
use pest_derive::Parser;
#[derive(Parser)]
#[grammar = "grammar.pest"] // path is relative to src/
pub struct DemoParser;
// Every time this file is compiled, pest uses the grammar file to generate
// the parser implementation and the Rule enum automatically.
#[cfg(test)]
mod test {
    use pest::Parser;
    use super::{DemoParser, Rule};
    #[test]
    pub fn test_parse() {
        let successful_parse = DemoParser::parse(Rule::field, "-273.15");
        println!("{:?}", successful_parse);
        let unsuccessful_parse = DemoParser::parse(Rule::field, "China");
        println!("{:?}", unsuccessful_parse);
    }
}
Detailed usage#
Parser API#
pest provides several ways to access the result of a successful parse. They are introduced below using the following example grammar:
number = { ASCII_DIGIT+ } // one or more decimal digits
enclosed = { "(.." ~ number ~ "..)" } // for instance, "(..1024..)"
sum = { number ~ " + " ~ number } // for instance, "1024 + 12"
- Tokens
pest uses tokens to represent a successful match. Every time a rule matches, two tokens are produced: one marking the start of the match and one marking the end. For example:
"3130 abc"
 |   ^ end(number)
 ^ start(number)
RustRover currently has a plugin that supports the pest format and can validate rules and display tokens.
- Nested rules
If a named rule contains another named rule, tokens are produced for both:
"(..6472..)"
 |  |   |  ^ end(enclosed)
 |  |   ^ end(number)
 |  ^ start(number)
 ^ start(enclosed)
In some cases, tokens do not occur at distinct character positions:
"1773 + 1362"
 |   |  |   ^ end(sum)
 |   |  |   ^ end(number)
 |   |  ^ start(number)
 |   ^ end(number)
 ^ start(number)
 ^ start(sum)
- Interface
Tokens are exposed as the Token enum, which has Start and End variants. Calling tokens() on a parse result returns an iterator over them:
let parse_result = DemoParser::parse(Rule::sum, "1773 + 1362").unwrap();
let tokens = parse_result.tokens();
for token in tokens {
println!("{:?}", token);
}
- Pairs
To explore the parse tree in terms of matching pairs of tokens, pest provides the Pair type, which can be used to:
- determine which rule produced the Pair
- use the Pair as the original &str
- inspect the inner named rules inside the Pair
let pair = DemoParser::parse(Rule::enclosed, "(..6472..) and more text")
.unwrap().next().unwrap();
assert_eq!(pair.as_rule(), Rule::enclosed);
assert_eq!(pair.as_str(), "(..6472..)");
let inner_rules = pair.into_inner();
println!("{}", inner_rules); // --> [number(3, 7)]
A Pair can have any number of inner rules. Pair::into_inner() returns a Pairs value, which is an iterator over each inner pair:
let pairs = DemoParser::parse(Rule::sum, "1773 + 1362")
.unwrap().next().unwrap()
.into_inner();
let numbers = pairs
.clone()
.map(|pair| str::parse(pair.as_str()).unwrap())
.collect::<Vec<i32>>();
assert_eq!(vec![1773, 1362], numbers);
for (found, expected) in pairs.zip(vec!["1773", "1362"]) {
assert_eq!(Rule::number, found.as_rule());
assert_eq!(expected, found.as_str());
}
- The parse method
The derived Parser provides a parse method that returns Result<Pairs, Error>. To access the underlying parse tree, match on or unwrap the result:
// check whether parsing succeeded
match DemoParser::parse(Rule::enclosed, "(..6472..)") {
Ok(mut pairs) => {
let enclosed = pairs.next().unwrap();
// ...
}
Err(error) => {
// ...
}
}
Parsing expression syntax#
The basic logic of PEG is simple and direct, and can be summarized in three steps:
- Try to match a rule
- If it succeeds, try the next step
- If it fails, try another rule
The grammar has four characteristics:
- Greediness
When a repetition expression is run on an input string, it runs greedily (as many times as possible), with the following results:
- If the match succeeds, it consumes whatever it matched and passes the remaining input on to the next step of the parser.
- If the match fails, it consumes nothing, and the failure propagates, eventually causing the parse to fail unless it is caught along the way.
// expression
ASCII_DIGIT+ // one or more characters from '0' to '9'
// matching process
"42 boxes"
 ^ Running ASCII_DIGIT+
"42 boxes"
   ^ Successfully took one or more digits!
" boxes"
 ^ Remaining unparsed input.
- Ordered choice
The grammar has an ordered choice operator |. For example, one | two tries one first and tries two only if one fails.
Because order matters, pay attention to where rules are placed within an expression. For example:
- Matching the string "abc" against the expression "a" | "ab" matches the first alternative "a" and stops there; the remaining "bc" is not parsed by this expression.
So when writing a choice, it is common to put the longest or most specific alternative first and the shortest or most general one last.
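The effect of alternative order can be reproduced with a few lines of std-only Rust. `ordered_choice` below is a hypothetical helper written for this illustration (it is not part of pest); it assumes every alternative is a string literal and returns the matched text together with the remaining input.

```rust
/// Try each literal in order and return (matched, rest) for the first one
/// that is a prefix of `input`. Once an alternative matches, later
/// alternatives are never considered, which is what "ordered" means here.
pub fn ordered_choice<'a>(
    alternatives: &[&str],
    input: &'a str,
) -> Option<(&'a str, &'a str)> {
    alternatives.iter().find_map(|alt| {
        input
            .strip_prefix(*alt)
            .map(|rest| (&input[..alt.len()], rest))
    })
}
```

With the alternatives in the order `["a", "ab"]`, the input "abc" yields `("a", "bc")`; swapping them to `["ab", "a"]` yields `("ab", "c")`. This is why the longer or more specific alternative usually goes first.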
- No backtracking
During parsing, an expression either succeeds or fails.
If it succeeds, parsing moves on to the next step. If it fails, the expression fails, and the engine does not step back and try again. This differs from regular expressions, which do backtrack.
Consider the following example (here ~ denotes the next step, taken after the rule before it has matched):
word = {     // to recognize a word...
    ANY*     //   take any character, zero or more times...
    ~ ANY    //   followed by any character
}
When matching the string "frumious", ANY* first consumes the entire string, so the next step, ANY, has nothing left to match and the parse fails.
"frumious"
 ^ (word)
"frumious"
 ^ (ANY*) Success! Continue to ANY with remaining input "".
""
 ^ (ANY) Failure! Expected one character, but found end of string.
In this situation, a system that backtracks (such as a regular expression engine) would step back one character, "spit it out", and try again.
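The difference can be sketched in std-only Rust. The functions below are hypothetical stand-ins written for this illustration, not pest internals: `word` follows the PEG behavior described above, while `word_backtracking` imitates what a backtracking engine would do by letting the greedy step give characters back. Both return the input left over after a successful match.

```rust
/// Greedy `ANY*`: consumes every remaining character and always succeeds.
fn any_star(input: &str) -> Option<&str> {
    Some(&input[input.len()..])
}

/// `ANY`: consumes exactly one character and fails at end of input.
fn any(input: &str) -> Option<&str> {
    let mut chars = input.chars();
    chars.next()?;
    Some(chars.as_str())
}

/// PEG-style `ANY* ~ ANY`: once `ANY*` has consumed everything, its result
/// is final, so `ANY` always sees an empty input and the rule fails.
pub fn word(input: &str) -> Option<&str> {
    let rest = any_star(input)?;
    any(rest)
}

/// Backtracking variant: try the longest `ANY*` match first, then shorter
/// and shorter ones, until `ANY` can match what is left.
pub fn word_backtracking(input: &str) -> Option<&str> {
    // Candidate end positions for `ANY*`, on character boundaries.
    let mut ends: Vec<usize> = input.char_indices().map(|(i, _)| i).collect();
    ends.push(input.len());
    ends.into_iter().rev().find_map(|end| any(&input[end..]))
}
```

On "frumious", `word` returns `None`, while `word_backtracking` succeeds after giving back the final "s".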
- Unambiguous
Each PEG rule runs on the remainder of the input string and consumes as much input as it can. Once a rule is done, the rest of the input is passed on to the other parts of the parser. For example, the expression ASCII_DIGIT+ matches one or more digits and always matches the longest possible run of consecutive digits. A later rule cannot backtrack unexpectedly and steal some of those digits in an unintuitive, non-local way.
This contrasts with other parsing tools such as regular expressions and CFGs, where the result of a rule often depends on code some distance away.
Parser syntax and built-in rules#
- Key syntax
pest has fewer syntax elements than regular expressions. The main ones and their meanings are listed briefly below; see the pest documentation for details:
Syntax | Meaning | Syntax | Meaning |
---|---|---|---|
foo = { ... } | regular rule | baz = @{ ... } | atomic |
bar = _{ ... } | silent | qux = ${ ... } | compound-atomic |
#tag = ... | tag | plugh = !{ ... } | non-atomic |
"abc" | exact string | ^"abc" | case-insensitive |
'a'..'z' | character range | ANY | any character |
foo ~ bar | sequence | `baz | qux` | ordered choice |
foo* | zero or more | bar+ | one or more |
baz? | optional | qux{n} | exactly n |
qux{m, n} | between m and n (inclusive) | | |
&foo | positive predicate | !bar | negative predicate |
PUSH(baz) | match and push | | |
POP | match and pop | PEEK | match without popping |
DROP | pop without matching | PEEK_ALL | match the entire stack |
- Built-in rules
Besides ANY, pest provides a large number of built-in rules that make parsing text more convenient. A few common ones are shown here; see the pest documentation for the full list:
Built-in rule | Equivalent | Built-in rule | Equivalent |
---|---|---|---|
ASCII_DIGIT | '0'..'9' | ASCII_ALPHANUMERIC | any digit or letter, `ASCII_DIGIT | ASCII_ALPHA` |
UPPERCASE_LETTER | uppercase letter | NEWLINE | any line-break format, `"\n" | "\r\n" | "\r"` |
LOWERCASE_LETTER | lowercase letter | SPACE_SEPARATOR | space separator |
MATH_SYMBOL | math symbol | EMOJI | emoji |
nom#
Overview#
nom is a parser combinator library, as introduced above, written in Rust. It has the following characteristics:
- It builds safe parsers without compromising speed or memory consumption.
- It relies on Rust's strong type system and memory safety to produce parsers that are both correct and efficient.
- It provides functions, macros, and traits that abstract away most of the error-prone plumbing, and it makes it easy to combine parsers into more complex ones.
nom supports a very wide range of applications, including these common ones:
- Binary format parsers: nom is as fast as parsers handwritten in C, is not exposed to buffer overflow vulnerabilities, and has common handling patterns built in.
- Text format parsers: it can handle CSV as well as more complex nested formats such as JSON, and comes with several useful tools for managing the data.
- Programming language parsers: nom can serve as a prototyping parser for a language, with support for custom error types and reporting, automatic whitespace handling, and building the AST in place.
- Beyond these: streaming formats (HTTP, network processing, and so on) and parser combinators that fit well with software engineering practice.
Usage example#
This section walks through the "hexadecimal color parser" example from the README of the nom repository.
The format of a hexadecimal color is:
- It starts with "#", followed by six characters; each pair of characters represents the value of one of the three color channels: red, green, and blue.
For example, in "#2F14DF", "2F" is the red channel, "14" is the green channel, and "DF" is the blue channel.
- Add the nom dependency to Cargo.toml
[dependencies]
nom = "7.1.3"
- Create a new file src/nom/hex_color.rs, import nom, and build the hex color parsing function hex_color
- tag matches a literal pattern at the start of the input. tag("#") returns a function whose return value is IResult<Input, Input, Error>.
- Here Input is the type of the function's input parameter. The first value is the input with the matched pattern removed, the second is the matched content, and the last is the error value.
- take_while_m_n, provided by nom, takes the minimum and maximum number of matches as its first two parameters and the matching predicate as its last parameter, and returns the same kind of value as above.
- map_res, provided by nom, converts the result of its first parameter using the function given as its second parameter.
- tuple, provided by nom, takes a series of combinators, applies them to the input in order, and returns the parse results in order as a tuple.
use nom::{AsChar, IResult};
use nom::bytes::complete::tag;
use nom::bytes::complete::take_while_m_n;
use nom::combinator::map_res;
use nom::sequence::tuple;
#[derive(Debug, PartialEq)]
pub struct Color {
pub red: u8,
pub green: u8,
pub blue: u8,
}
// whether the character is a hexadecimal digit
pub fn is_hex_digit(c: char) -> bool {
c.is_hex_digit()
}
// parse a two-character hexadecimal string into its numeric (u8) value
pub fn to_num(input: &str) -> Result<u8, std::num::ParseIntError> {
u8::from_str_radix(input, 16)
}
// match exactly two characters that satisfy is_hex_digit, then convert them to a number with to_num
pub fn hex_primary(input: &str) -> IResult<&str, u8> {
map_res(
take_while_m_n(2, 2, is_hex_digit),
to_num,
)(input)
}
// parser for a hexadecimal color
pub fn hex_color(input: &str) -> IResult<&str, Color> {
let (input, _) = tag("#")(input)?;
let (input, (red, green, blue)) = tuple((hex_primary, hex_primary, hex_primary))(input)?;
Ok((input, Color { red, green, blue }))
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_hex_color() {
assert_eq!(hex_color("#2F14DF"), Ok(("", Color {
red: 47,
green: 20,
blue: 223,
})))
}
}
Detailed usage#
Parser results#
The IResult returned by the nom parsing functions in the example above is one of nom's core types. It represents the result of a nom parse.
First, the result of a parser built with nom is defined as follows:
- Ok(...) holds what was found when parsing succeeded; Err(...) means the parser did not find what it was looking for.
- On success, the return value is a tuple: the first value is everything the parser did not consume, and the second is everything the parser matched.
- On failure, one of several kinds of error may be returned.
                                   ┌─► Ok(
                                   │      what the parser didn't touch,
                                   │      what the parser matched
                                   │   )
             ┌─────────┐           │
 my input───►│my parser├──►either──┤
             └─────────┘           └─► Err(...)
To represent this model, nom defines the type IResult<Input, Output, Error>:
- Input and Output can be different types, and Error can be any type that implements the ParseError trait.
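The shape of that result can be imitated with std-only Rust. The sketch below is an assumption-laden stand-in, not nom's definition: `MiniResult` and `hex_byte` are hypothetical names, and the error type is simplified to a `String`. It shows an input of type `&str` producing an output of a different type (`u8`) together with the unconsumed input.

```rust
/// Simplified stand-in for the result shape described above:
/// Ok((remaining input, output)) or Err(description).
pub type MiniResult<'a, O> = Result<(&'a str, O), String>;

/// Parse the first two characters of `input` as one hexadecimal byte.
/// Input is a &str, while the output is a u8.
pub fn hex_byte(input: &str) -> MiniResult<'_, u8> {
    // `get` returns None if the input is too short or the cut would fall
    // inside a multi-byte character.
    let digits = input
        .get(..2)
        .ok_or_else(|| format!("need two characters, got {:?}", input))?;
    if !digits.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(format!("not hexadecimal: {:?}", digits));
    }
    let value = u8::from_str_radix(digits, 16).map_err(|e| e.to_string())?;
    Ok((&input[2..], value))
}
```

Calling `hex_byte("2F14DF")` returns the remaining input "14DF" first and the parsed value 47 second, the same ordering as in the tuple described above.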
Tags and character classes#
- Tags (byte-sequence tags)
nom calls a simple sequence of bytes a tag. Because these are so common, nom has a built-in tag() function that returns a parser for a given string.
For example, to parse the string "abc", use tag("abc").
Note that nom has more than one definition of tag (for example, complete and streaming variants). Unless stated otherwise, the following one is normally used, which avoids unexpected errors:
pub use nom::bytes::complete::tag;
The signature of tag is shown below. tag returns a function; that function is a parser that takes the input (here a &str) and returns an IResult:
pub fn tag<T, Input, Error: ParseError<Input>>(
tag: T
) -> impl Fn(Input) -> IResult<Input, Input, Error> where
Input: InputTake + Compare<T>,
T: InputLength + Clone,
Here is an example of a function that uses tag:
use nom::bytes::complete::tag;
use nom::IResult;
pub fn parse_input(input: &str) -> IResult<&str, &str> {
tag("abc")(input)
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_parse_input() {
let (leftover_input, output) = parse_input("abcWorld!").unwrap();
assert_eq!(leftover_input, "World!");
assert_eq!(output, "abc");
assert!(parse_input("defWorld").is_err());
}
}
- Character classes
Because a tag only matches a fixed sequence at the start of the input, nom also provides pre-written parsers, called character classes, that accept any character from a set. Some frequently used built-in parsers:
Parser | Purpose | Parser | Purpose |
---|---|---|---|
alpha0/alpha1 | Recognizes zero or more lowercase and uppercase letters; alpha1 requires at least one character | multispace0/multispace1 | Recognizes zero or more spaces, tabs, and line breaks; multispace1 requires at least one character |
alphanumeric0/alphanumeric1 | Recognizes zero or more digits or letters; alphanumeric1 requires at least one character | space0/space1 | Recognizes zero or more spaces and tabs; space1 requires at least one character |
digit0/digit1 | Recognizes zero or more digits; digit1 requires at least one character | newline | Recognizes a line break |
Here is a simple example of how they are used:
use nom::character::complete::alpha0;
use nom::IResult;
fn parse_alpha(input: &str) -> IResult<&str, &str> {
alpha0(input)
}
#[test]
fn test_parse_alpha() {
let (remaining, letters) = parse_alpha("abc123").unwrap();
assert_eq!(remaining, "123");
assert_eq!(letters, "abc");
}
Alternatives and composition#
- Alternatives
nom provides the alt() combinator for choosing between multiple parsers. It runs each parser in the tuple in turn until one succeeds.
An error is returned only if every parser in the tuple fails.
Here is a simple example:
use nom::branch::alt;
use nom::bytes::complete::tag;
use nom::IResult;
fn parse_abc_or_def(input: &str) -> IResult<&str, &str> {
alt((
tag("abc"),
tag("def"),
))(input)
}
#[test]
fn test_parse_abc_or_def() {
let (leftover_input, output) = parse_abc_or_def("abcWorld").unwrap();
assert_eq!(leftover_input, "World");
assert_eq!(output, "abc");
let (_, output) = parse_abc_or_def("defWorld").unwrap();
assert_eq!(output, "def");
assert!(parse_abc_or_def("ghiWorld").is_err());
}
- Composition
Besides choosing between parsers, running parsers one after another is a very common need, so nom provides built-in combinators for it.
For example, tuple() takes a tuple of parsers and returns either Ok with a tuple of all their results, or the Err of the first parser that fails.
use nom::branch::alt;
use nom::bytes::complete::tag_no_case;
use nom::IResult;
use nom::sequence::tuple;
fn parse_base(input: &str) -> IResult<&str, &str> {
alt((
tag_no_case("a"), // case-insensitive tag
tag_no_case("t"),
tag_no_case("c"),
tag_no_case("g"),
))(input)
}
fn parse_pair(input: &str) -> IResult<&str, (&str, &str)> {
tuple((
parse_base, parse_base
))(input)
}
#[test]
fn test_parse_pair() {
let (remaining, parsed) = parse_pair("aTcG").unwrap();
assert_eq!(parsed, ("a", "T"));
assert_eq!(remaining, "cG");
assert!(parse_pair("Dct").is_err());
}
Beyond the combinators above, nom also provides parsers for similar sequencing operations:
Combinator | Usage | Input | Output |
---|---|---|---|
delimited | delimited(char('('), take(2), char(')')) | "(ab)cd" | Ok(("cd", "ab")) |
preceded | preceded(tag("ab"), tag("XY")) | "abXYZ" | Ok(("Z", "XY")) |
terminated | terminated(tag("ab"), tag("XY")) | "abXYZ" | Ok(("Z", "ab")) |
pair | pair(tag("ab"), tag("XY")) | "abXYZ" | Ok(("Z", ("ab", "XY"))) |
separated_pair | separated_pair(tag("hello"), char(','), tag("world")) | "hello,world!" | Ok(("!", ("hello", "world"))) |
Parsers with custom return types#
Because the Input and Output of IResult can be different types, nom provides the value combinator for turning a successful match, such as a tag, into a specific value. Here is an example:
use nom::branch::alt;
use nom::bytes::complete::tag;
use nom::combinator::value;
use nom::IResult;
fn parse_bool(input: &str) -> IResult<&str, bool> {
alt((
value(true, tag("true")), // map the match to a bool
value(false, tag("false")),
))(input)
}
#[test]
fn test_parse_bool() {
let (remaining, parsed) = parse_bool("true|false").unwrap();
assert_eq!(parsed, true);
assert_eq!(remaining, "|false");
assert!(parse_bool(remaining).is_err());
}
Repetition with predicates and parsers#
- Repetition with predicates
A predicate-based parser keeps consuming input for as long as some condition holds. nom provides several categories of these, mainly take_till, take_until, and take_while:
Combinator | Purpose | Usage | Input | Output |
---|---|---|---|---|
take_till | Consumes input until the predicate is satisfied | take_till(is_alphabetic) | "123abc" | Ok(("abc", "123")) |
take_while | Consumes input for as long as the predicate is satisfied | take_while(is_alphabetic) | "abc123" | Ok(("123", "abc")) |
take_until | Consumes input until the first occurrence of the pattern | take_until("world") | "Hello world" | Ok(("world", "Hello ")) |
A few additions:
- Each of these combinators has a "twin" whose name ends in 1. The twin requires at least one matched character and returns an error otherwise.
- take_while_m_n, used earlier, is effectively a special case of take_while that guarantees it consumes between m and n bytes (inclusive).
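To make the three behaviors concrete, here is a std-only sketch. The `*_sketch` functions are hypothetical re-implementations written for this illustration; they are not nom's code, and unlike nom they return plain tuples or an `Option` rather than an `IResult`. Each returns (remaining input, matched prefix), in the same order as the table above.

```rust
/// Longest prefix whose characters all satisfy `pred`.
pub fn take_while_sketch(input: &str, pred: impl Fn(char) -> bool) -> (&str, &str) {
    let end = input.find(|c: char| !pred(c)).unwrap_or(input.len());
    (&input[end..], &input[..end])
}

/// Longest prefix in which no character satisfies `pred`.
pub fn take_till_sketch(input: &str, pred: impl Fn(char) -> bool) -> (&str, &str) {
    let end = input.find(|c: char| pred(c)).unwrap_or(input.len());
    (&input[end..], &input[..end])
}

/// Everything before the first occurrence of `pattern`; None if the pattern
/// never occurs. The comparison is case-sensitive.
pub fn take_until_sketch<'a>(input: &'a str, pattern: &str) -> Option<(&'a str, &'a str)> {
    let end = input.find(pattern)?;
    Some((&input[end..], &input[..end]))
}
```

Note that the pattern itself is not consumed by `take_until_sketch`: it stays at the front of the remaining input, and a pattern that differs only in case ("world" against "Hello World") is not found.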
- Repeating parsers
Besides repeating on a predicate, nom also provides combinators that repeat a whole parser. For example, many0 applies a parser as many times as possible and returns a vector of the results. Here is an example:
use nom::bytes::complete::tag;
use nom::IResult;
use nom::multi::many0;
fn repeat_parser(s: &str) -> IResult<&str, Vec<&str>> {
many0(tag("abc"))(s)
}
#[test]
fn test_repeat_parser() {
assert_eq!(repeat_parser("abcabc"), Ok(("", vec!["abc", "abc"])));
assert_eq!(repeat_parser("abc123"), Ok(("123", vec!["abc"])));
assert_eq!(repeat_parser("123123"), Ok(("123123", vec![])));
assert_eq!(repeat_parser(""), Ok(("", vec![])));
}
Some commonly used combinators are listed below:
Combinator | Usage | Input | Output |
---|---|---|---|
count | count(take(2), 3) | "abcdefgh" | Ok(("gh", vec!["ab", "cd", "ef"])) |
many0 | many0(tag("ab")) | "abababc" | Ok(("c", vec!["ab", "ab", "ab"])) |
many_m_n | many_m_n(1, 3, tag("ab")) | "ababc" | Ok(("c", vec!["ab", "ab"])) |
many_till | many_till(tag( "ab" ), tag( "ef" )) | "ababefg" | Ok(("g", (vec!["ab", "ab"], "ef"))) |
separated_list0 | separated_list0(tag(","), tag("ab")) | "ab,ab,ab." | Ok((".", vec!["ab", "ab", "ab"])) |
fold_many0 | fold_many0(be_u8, || 0, |acc, item| acc + item) | [1, 2, 3] | Ok(([], 6)) |
fold_many_m_n | fold_many_m_n(1, 2, be_u8, || 0, |acc, item| acc + item) | [1, 2, 3] | Ok(([3], 3)) |
length_count | length_count(number, tag("ab")) | "2ababab" | Ok(("ab", vec!["ab", "ab"])) |
Error management#
nom's errors are designed with several needs in mind:
- indicate which parser failed and where in the input data
- accumulate more context as the error travels up the chain of parsers
- have very low overhead, because errors are often discarded by the calling parser
- be modifiable according to the user's needs, because some languages need much more information
To meet these needs, the result type of a nom parser is designed as follows:
pub type IResult<I, O, E=nom::error::Error<I>> = Result<(I, O), nom::Err<E>>;
pub enum Err<E> {
Incomplete(Needed), // the parser did not have enough data to decide; usually encountered when streaming
Error(E), // a normal, recoverable parser error; e.g. when a child parser of alt returns Error, alt tries the next child parser
Failure(E), // an unrecoverable error; e.g. when a child parser returns Failure, alt does not try the other branches
}
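The practical difference between Error and Failure shows up in choice combinators. The std-only sketch below imitates that behavior under simplifying assumptions: `MiniErr`, `Branch`, `alt_sketch`, `quoted`, `bare_word`, and `parse_value` are all hypothetical names invented for this illustration, errors carry only a message, and the streaming Incomplete case is left out. It is not nom's implementation.

```rust
/// Simplified error type with only the two variants of interest.
#[derive(Debug, PartialEq)]
pub enum MiniErr {
    Error(String),   // recoverable: a choice may try its next branch
    Failure(String), // unrecoverable: a choice must stop immediately
}

/// A branch takes the input and returns (remaining input, matched text).
pub type Branch = fn(&str) -> Result<(&str, &str), MiniErr>;

/// Try branches in order; move on after Error, stop at Ok or Failure.
pub fn alt_sketch<'a>(
    branches: &[Branch],
    input: &'a str,
) -> Result<(&'a str, &'a str), MiniErr> {
    let mut last = MiniErr::Error("no branches".to_string());
    for branch in branches {
        match branch(input) {
            Ok(found) => return Ok(found),
            Err(MiniErr::Failure(msg)) => return Err(MiniErr::Failure(msg)),
            Err(err) => last = err,
        }
    }
    Err(last)
}

/// A double-quoted string. A missing opening quote is a normal Error, but
/// once the quote has been seen, a missing closing quote is a Failure.
fn quoted(input: &str) -> Result<(&str, &str), MiniErr> {
    let body = input
        .strip_prefix('"')
        .ok_or_else(|| MiniErr::Error("expected opening quote".to_string()))?;
    match body.find('"') {
        Some(end) => Ok((&body[end + 1..], &body[..end])),
        None => Err(MiniErr::Failure("unterminated string".to_string())),
    }
}

/// A run of non-whitespace characters.
fn bare_word(input: &str) -> Result<(&str, &str), MiniErr> {
    let end = input.find(char::is_whitespace).unwrap_or(input.len());
    if end == 0 {
        return Err(MiniErr::Error("expected a word".to_string()));
    }
    Ok((&input[end..], &input[..end]))
}

/// Either a quoted string or a bare word, in that order.
pub fn parse_value(input: &str) -> Result<(&str, &str), MiniErr> {
    let branches: [Branch; 2] = [quoted, bare_word];
    alt_sketch(&branches, input)
}
```

For the input `"hi x` (with an opening quote but no closing one), `bare_word` would happily match, but it is never tried because `quoted` has already reported a Failure. That is the design choice behind having two error variants: a branch can signal that the input clearly belongs to it and is simply malformed.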
- Common error types inside nom::Err<E>
- The default error type nom::error::Error reports which parser produced the error and the input position of the error.
#[derive(Debug, PartialEq)]
pub struct Error<I> {
    /// position of the error in the input data
    pub input: I,
    /// nom error code
    pub code: ErrorKind,
}
This error type is fast and has low overhead, so it suits parsers that are called repeatedly, but it is limited. For example, it carries no call-chain information.
- For more information, nom::error::VerboseError returns more detail about the chain of parsers that hit the error, such as the kind of parser.
#[derive(Clone, Debug, PartialEq)]
pub struct VerboseError<I> {
    /// list of errors accumulated by `VerboseError`, containing the affected
    /// part of the input data and some context
    pub errors: crate::lib::std::vec::Vec<(I, VerboseErrorKind)>,
}
#[derive(Clone, Debug, PartialEq)]
/// error context for `VerboseError`
pub enum VerboseErrorKind {
    /// static string added by the `context` function
    Context(&'static str),
    /// indicates which character was expected by the `char` function
    Char(char),
    /// error kind given by various nom parsers
    Nom(ErrorKind),
}
By looking at the original input and the chain of errors, you can build more user-friendly error messages. The nom::error::convert_error function builds such a message.
- Custom error types with the ParseError trait
You can define your own error type by implementing the ParseError<I> trait.
All nom combinators are generic over the error type, so you only need to name your type in the parser's result type and it will be used everywhere.
pub trait ParseError<I>: Sized {
    // creates an error from the input position and an ErrorKind that says which parser failed
    fn from_error_kind(input: I, kind: ErrorKind) -> Self;
    // lets a chain of errors be built up while backtracking through the parser tree
    // (various combinators add more context)
    fn append(input: I, kind: ErrorKind, other: Self) -> Self;
    // creates an error that indicates which character was expected
    fn from_char(input: I, _: char) -> Self {
        Self::from_error_kind(input, ErrorKind::Char)
    }
    // lets combinators like alt choose between (or accumulate) errors from different branches
    fn or(self, other: Self) -> Self {
        other
    }
}
You can also implement the ContextError trait to support the context() combinator, which VerboseError<I> uses.
Here is a simple example of how this works. It defines a debugging error type that prints extra information every time an error is created:
use nom::error::{ContextError, ErrorKind, ParseError};
#[derive(Debug)]
struct DebugError {
    message: String,
}
impl ParseError<&str> for DebugError {
    // print the kind of parser that failed
    fn from_error_kind(input: &str, kind: ErrorKind) -> Self {
        let message = format!("[{:?}]:\t{:?}\n", kind, input);
        println!("{}", message);
        DebugError { message }
    }
    // when several errors are encountered, print the accumulated context as well
    fn append(input: &str, kind: ErrorKind, other: Self) -> Self {
        let message = format!("[{}{:?}]:\t{:?}\n", other.message, kind, input);
        println!("{}", message);
        DebugError { message }
    }
    // print the specific character that was expected
    fn from_char(input: &str, c: char) -> Self {
        let message = format!("[{}]:\t{:?}\n", c, input);
        print!("{}", message);
        DebugError { message }
    }
    fn or(self, other: Self) -> Self {
        let message = format!("{}\tOR\n{}\n", self.message, other.message);
        println!("{}", message);
        DebugError { message }
    }
}
impl ContextError<&str> for DebugError {
    fn add_context(_input: &str, _ctx: &'static str, other: Self) -> Self {
        let message = format!("[{}\"{}\"]:\t{:?}\n", other.message, _ctx, _input);
        print!("{}", message);
        DebugError { message }
    }
}
- Debugging parsers
While writing a parser, if you need to trace what a parser is doing, you can wrap it in the dbg_dmp function, which prints the error and a hex dump of the input when the wrapped parser fails:
use nom::{bytes::complete::tag, error::dbg_dmp, IResult};
fn f(i: &[u8]) -> IResult<&[u8], &[u8]> {
    dbg_dmp(tag("abcd"), "tag")(i)
}
let a = &b"efghijkl"[..];
// Will print the following message:
// tag: Error(Error(Error { input: [101, 102, 103, 104, 105, 106, 107, 108], code: Tag })) at:
// 00000000 65 66 67 68 69 6a 6b 6c efghijkl
f(a);
Summary#
This article covered the background knowledge on parsers (parsers, PEG, and parser combinators) and how to use the third-party libraries needed to implement parsers in Rust (pest and nom). Both pest, which is built on PEG, and nom, which is built on parser combinators, cover the common scenarios for implementing a custom parser, so there is no need to write one by hand from scratch, which greatly reduces the cost of implementing a custom parser. For more complex situations, where factors such as performance, implementation cost, and documentation matter, choose the library that fits the specific scenario.
The next article will use pest and nom to implement a few common parsers, to get a better grasp of the key points.
References#
https://zhuanlan.zhihu.com/p/427767002
https://zh.wikipedia.org/wiki/%E8%A7%A3%E6%9E%90%E8%A1%A8%E8%BE%BE%E6%96%87%E6%B3%95
https://zhuanlan.zhihu.com/p/355364928
https://ohmyweekly.github.io/notes/2021-01-20-pest-grammars/#
https://pest.rs/book/parser_api.html
https://rustmagazine.github.io/rust_magazine_2021/chapter_4/nom_url.html