I'm working on a project that scrapes websites and outputs each page's HTML as JSON. The only parts of that JSON that are useful to us are the forms.
I wanted to filter for them, but the native array filter only works when I know where the element sits relative to the entire page (the DOM?), and that won't always be the case. I also fear that checking every object's value until I reach the desired one isn't viable, because:
- some pages are humongous,
- "form" also appears as a string in other places that we don't want.
This is in Node.js.
Snippet of input:
[
  {
    "type": "element",
    "tagName": "p",
    "attributes": [],
    "children": [
      {
        "type": "text",
        "content": "This is how the HTML code above will be displayed in a browser:"
      }
    ]
  },
  {
    "type": "text",
    "content": "\n"
  },
  {
    "type": "element",
    "tagName": "form",
    "attributes": [
      {
        "key": "action",
        "value": "/action_page.php"
      },
      {
        "key": "target",
        "value": "_blank"
      }
    ],
    "children": [
      {
        "type": "text",
        "content": "\nFirst name:"
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "input",
        "attributes": [
          {
            "key": "type",
            "value": "text"
          },
          {
            "key": "name",
            "value": "firstname0"
          },
          {
            "key": "value",
            "value": "John"
          }
        ],
        "children": []
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "text",
        "content": "\nLast name:"
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "input",
        "attributes": [
          {
            "key": "type",
            "value": "text"
          },
          {
            "key": "name",
            "value": "lastname0"
          },
          {
            "key": "value",
            "value": "Doe"
          }
        ],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "input",
        "attributes": [
          {
            "key": "type",
            "value": "submit"
          },
          {
            "key": "value",
            "value": "Submit"
          }
        ],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "input",
        "attributes": [
          {
            "key": "type",
            "value": "reset"
          }
        ],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      }
    ]
  },
  {
    "type": "text",
    "content": "\n"
  }
]
A snippet of the desired output:
[
  {
    "type": "element",
    "tagName": "form",
    "attributes": [
      {
        "key": "action",
        "value": "/action_page.php"
      },
      {
        "key": "target",
        "value": "_blank"
      }
    ],
    "children": [
      {
        "type": "text",
        "content": "\nFirst name:"
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "input",
        "attributes": [
          {
            "key": "type",
            "value": "text"
          },
          {
            "key": "name",
            "value": "firstname0"
          },
          {
            "key": "value",
            "value": "John"
          }
        ],
        "children": []
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "text",
        "content": "\nLast name:"
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "input",
        "attributes": [
          {
            "key": "type",
            "value": "text"
          },
          {
            "key": "name",
            "value": "lastname0"
          },
          {
            "key": "value",
            "value": "Doe"
          }
        ],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "element",
        "tagName": "br",
        "attributes": [],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "input",
        "attributes": [
          {
            "key": "type",
            "value": "submit"
          },
          {
            "key": "value",
            "value": "Submit"
          }
        ],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      },
      {
        "type": "element",
        "tagName": "input",
        "attributes": [
          {
            "key": "type",
            "value": "reset"
          }
        ],
        "children": []
      },
      {
        "type": "text",
        "content": "\n"
      }
    ]
  }
]
TL;DR: retain only the forms and all of their children.
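To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive implementation. It walks the tree with an explicit stack instead of recursion, so deep pages should not exhaust the call stack, and it matches on a node's own "type" and "tagName" keys, so the string "form" inside text content or attribute values is ignored. The names collectForms and sample are hypothetical, and the sketch assumes every element node has a "children" array, as in the input snippet above.

```javascript
// Hypothetical helper: returns every form element found anywhere in the tree,
// in document order. Each form is kept whole, including all of its children.
function collectForms(nodes) {
  const found = [];
  // Explicit stack (reversed so that pop() yields nodes in document order).
  const stack = [...nodes].reverse();

  while (stack.length > 0) {
    const node = stack.pop();

    // Match on the node's own keys, not on any string that contains "form".
    if (node.type === "element" && node.tagName === "form") {
      found.push(node);
      continue; // the whole subtree is already retained, no need to descend
    }

    // Text nodes have no "children" key, so they are skipped here.
    if (Array.isArray(node.children)) {
      for (let i = node.children.length - 1; i >= 0; i--) {
        stack.push(node.children[i]);
      }
    }
  }
  return found;
}

// Made-up sample data in the same shape as the input snippet.
const sample = [
  {
    type: "element",
    tagName: "div",
    attributes: [],
    children: [
      { type: "text", content: "form" }, // a stray "form" string, not a form element
      {
        type: "element",
        tagName: "form",
        attributes: [{ key: "action", value: "/a" }],
        children: [
          { type: "element", tagName: "input", attributes: [], children: [] }
        ]
      }
    ]
  },
  { type: "text", content: "\n" },
  { type: "element", tagName: "form", attributes: [], children: [] }
];

console.log(collectForms(sample).length); // 2
```

Every node is visited at most once, so the cost grows linearly with the size of the page, and nothing inside a matched form is inspected again.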
 
    