HTMLQ

Stumbled upon something cool: htmlq! It’s like jq, but for HTML.

Install Rust

htmlq is written in Rust and installed with cargo, so let’s install Rust first.

doas pkg_add rust
cat << 'EOF' |doas tee -a /etc/profile
# Rust/Cargo
export PATH=$PATH:/root/.cargo/bin

EOF
. /etc/profile
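
Since `tee -a` appends, running the snippet above a second time adds a second export line to /etc/profile. A minimal sketch of a guard that only extends PATH when the cargo directory is missing, assuming the same /root/.cargo/bin location; `path_contains` and `CARGO_BIN` are hypothetical names used for illustration:

```shell
# Return 0 if directory $1 is already listed in $PATH, 1 otherwise
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: cargo was run as root, so binaries land in /root/.cargo/bin
CARGO_BIN=/root/.cargo/bin

# Append the directory only when it is not on PATH yet
if ! path_contains "$CARGO_BIN"; then
    PATH="$PATH:$CARGO_BIN"
    export PATH
fi
```

Placed in /etc/profile instead of the plain export line, this could be sourced repeatedly without growing PATH.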

Install HTMLQ

doas cargo install htmlq

Some Examples

curl -s https://www.openbsd.org | htmlq --attribute href a |head

Example

user@nixbox$ curl -s https://www.openbsd.org | htmlq --attribute href a |head
goals.html
plat.html
security.html
crypto.html
events.html
innovations.html
faq/faq4.html#Download
anoncvs.html
https://cvsweb.openbsd.org/
https://github.com/openbsd

curl --silent https://www.nytimes.com | htmlq a --attribute href -b https://www.nytimes.com

Example

user@nixhost$ curl --silent https://www.nytimes.com | htmlq a --attribute href -b https://www.nytimes.com |head
https://www.nytimes.com/#site-content
https://www.nytimes.com/#site-index
https://www.nytimes.com/
https://www.nytimes.com/
https://www.nytimes.com/international/?action=click&region=Editions&pgtype=Homepage
https://www.nytimes.com/ca/?action=click&region=Editions&pgtype=Homepage
https://www.nytimes.com/es/
https://cn.nytimes.com/
https://myaccount.nytimes.com/auth/login?response_type=cookie&client_id=vi&redirect_uri=https%3A%2F%2Fwww.nytimes.com%2Fsubscription%2Fonboarding-offer%3FcampaignId%3D7JFJX&asset=masthead
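
The list above contains duplicates and links to other hosts. A small sketch of a filter that could be put behind htmlq to keep only links on one host and drop repeats while preserving order; `links_on_host` is a hypothetical helper, and it only matches https URLs:

```shell
# Read URLs on stdin; print each URL that starts with https://$1/ once,
# in the order of first appearance
links_on_host() {
    awk -v host="$1" '
        index($0, "https://" host "/") == 1 && !seen[$0]++
    '
}

# Possible usage (needs network access and htmlq):
# curl --silent https://www.nytimes.com \
#   | htmlq a --attribute href -b https://www.nytimes.com \
#   | links_on_host www.nytimes.com | head
```

Using `index()` instead of a regular expression avoids having to escape the dots in the host name.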

Dump HTML Body, Highlight with BAT

curl --silent https://blog.stoege.net | htmlq 'body' | bat --language html

Example

───────┬──────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       │ STDIN
───────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1   │ <body class="body">
   2   │     <div class="container container--outer">
   3   │         <header class="header">
   4   │     <div class="container header__container">
   5   │         
   6   │     <div class="logo">
   7   │         <a class="logo__link" href="/" rel="home" title="blog-stöge-net">
   8   │             <div class="logo__item logo__text">
   9   │                     <div class="logo__title">blog-stöge-net</div>
  10   │                     <div class="logo__tagline">BSD is for people who love Unix, Linux is for people who hate Micr
       │ osoft</div>
  11   │                 </div>
  12   │         </a>
  13   │     </div>
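
Instead of dumping the whole body, a narrower CSS selector can pull out a single element. The sketch below runs against a trimmed copy of the markup shown above rather than the live site, and uses htmlq's `--text` flag to print only the text content; `sample_html` and `select_text` are hypothetical names for illustration:

```shell
# Sample markup, trimmed from the body dump above
sample_html='<body class="body">
<div class="logo">
<a class="logo__link" href="/" rel="home" title="blog-stöge-net">
<div class="logo__title">blog-stöge-net</div>
</a>
</div>
</body>'

# Print the text of every element in the sample matching CSS selector $1
select_text() {
    printf '%s\n' "$sample_html" | htmlq --text "$1"
}

# Only run the demo when htmlq is actually installed
if command -v htmlq >/dev/null 2>&1; then
    select_text '.logo__title'
fi
```

The same selector should work against the live page by replacing the printf with the curl call used earlier.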
