如何使用JS实现一个HTML解析器

浏览器底层有一块非常重要的事情就是HTML解析器,HTML解析器的工作是把HTML字符串解析为树,树上的每个节点是一个Nod,很多同学都好奇是怎么实现的,这篇文章就用JS来实现一个简单的HTML解析器。

原理讲解

1、效果

我们需要实现一个pars方法,并且传入HTML字符串,返回一个树结构:

constroot=pars(`divid="tst"class="containr"c="b"divclass="txt-block"spanHlloWorld/span/divimgsrc="xx.jpg"//div`);consol.log(root);//[{"tagNam":"","childn":[{"tagNam":"div","attrs":{"id":"tst","class":"containr"},"rawAttrs":"id=\"tst\"class=\"containr\"c=\"b\"","typ":"lmnt","rang":[0,18],"childn":[{"tagNam":"div","attrs":{"class":"txt-block"},"rawAttrs":"class=\"txt-block\"","typ":"lmnt","rang":[9,10],"childn":[{"tagNam":"span","attrs":{"id":"xxx"},"rawAttrs":"id=\"xxx\"","typ":"lmnt","rang":[6,96],"childn":[{"typ":"txt","rang":[78,89],"valu":"HlloWorld"}]}]},{"tagNam":"img","attrs":{},"rawAttrs":"src=\"xx.jpg\"","typ":"lmnt","rang":[10,1],"childn":[]}]}]}]

、核心原理用正则匹配出tagclass="tag"aa=""、/tag通过先进后出(栈)的方式匹配标签对(tag/tag)、初始化

首先我们需要初始化一些简单的变量和方法备用:

//初始化种Nod类型//HTML[nodTyp](


转载请注明:http://www.aierlanlan.com/rzdk/9964.html