How do I get the value attribute based on a search of some other attribute?
For example:
<body>
<input name="dummy" value="foo">
<input name="alpha" value="bar">
</body>
How do I get the value of the input element with the name "dummy"?
 
    
     
    
Since you're looking for a solution using bash and sed, I'm assuming you're looking for a Linux command-line option.
Use the hxselect HTML parsing tool to extract the element, then use sed to extract the value from the element.
I did a Google search for "linux bash parse html tool" and came across this: https://unix.stackexchange.com/questions/6389/how-to-parse-hundred-html-source-code-files-in-shell
The accepted answer suggests using the hxselect tool from the html-xml-utils package, which extracts elements based on a CSS selector.
So after installing it (download, unzip, ./configure, make, make install), you can run this command with a CSS selector for the element:
hxselect "input[name='dummy']" < example.html
(Given that example.html contains your example html from the question.) This will return:
<input name="dummy" value="foo"/>
Almost there. We need to extract the value from that line:
hxselect "input[name='dummy']" < example.html | sed -n -e "s/^.*value=['\"]\([^'\"]*\)['\"].*/\1/p"
Which returns foo.
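When a regular expression pulls a quoted attribute value out of a line, as the sed step does, it matters how far the capture group is allowed to run if other attributes follow value. The sketch below uses Python's re module to compare a greedy group with a negated character class. The sample element with an extra class attribute is hypothetical and not from the question.

```python
import re

# Hypothetical element with another attribute after value.
line = '<input name="dummy" value="foo" class="wide"/>'

# Greedy capture: (.*) runs on to the last quote on the line.
greedy = re.sub(r'''^.*value=['"](.*)['"].*''', r'\1', line)

# Negated character class: the capture stops at the first closing quote.
bounded = re.sub(r'''^.*value=['"]([^'"]*)['"].*''', r'\1', line)

print(greedy)
print(bounded)
```

The first result includes everything up to the final quote on the line, while the second contains only the value. The same character-class idea carries over to sed's basic regular expressions.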
 
    
     
    
Since you're asking for sed, I'll assume you want a command-line option. However, a tool built for HTML parsing may be more effective. The problem with my first answer is that I don't know of a way in CSS to select the value of an attribute (does anyone else?). With XML, however, you can select attributes just as you can other elements. Here is a command-line option using an XML parsing tool.
Install xmlstarlet with your package manager, then run:
xmlstarlet sel -t -v //input[@name=\'dummy\']/@value example.html
(where example.html contains your html). For the file to parse as XML, <input> must be changed to <input/>. The output is:
foo
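The same attribute lookup can be sketched with Python's standard xml.etree.ElementTree module, for cases where installing xmlstarlet is not an option. This is a minimal illustration, not a replacement for the command above. It assumes the markup is well-formed XML (inputs closed as <input/>). ElementTree supports only a subset of XPath, so the attribute is read with get() rather than selected with /@value.

```python
import xml.etree.ElementTree as ET

# The question's markup, with <input> closed as <input/> so it parses as XML.
doc = """<body>
<input name="dummy" value="foo"/>
<input name="alpha" value="bar"/>
</body>"""

root = ET.fromstring(doc)

# Find the first input element whose name attribute is "dummy".
element = root.find(".//input[@name='dummy']")

# Read its value attribute (None if no such element was found).
value = element.get("value") if element is not None else None
print(value)
```

For the markup shown, this prints foo.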
    
     
    
Parsing HTML with sed is generally a bad idea, since sed works in a line-based manner and HTML does not usually consider newlines syntactically important. It's not good if your HTML-handling tools break when the HTML is reformatted.
Instead, consider using Python, which has an HTML push parser in its standard library. For example:
#!/usr/bin/env python3
from html.parser import HTMLParser
from sys import argv
# Our parser. It inherits the standard HTMLParser that does most of
# the work.
class MyParser(HTMLParser):
    # We just hook into the handling of start tags to extract the
    # attribute
    def handle_starttag(self, tag, attrs):
        # Build a dictionary from the attribute list for easier
        # handling
        attrs_dict = dict(attrs)
        # Then, if the tag matches our criteria
        if tag == 'input' and attrs_dict.get('name') == 'dummy':
            # Print the value attribute (or an empty string if it
            # doesn't exist or has no value)
            print(attrs_dict.get('value') or "")
# After we defined the parser, all that's left is to use it. So,
# build one:
p = MyParser()
# And feed a file to it (here: the first command line argument).
# The file is read as text; UTF-8 is assumed here.
with open(argv[1], encoding='utf-8') as f:
    p.feed(f.read())
Save this code as, say, foo.py, then run
python3 foo.py foo.html
where foo.html is your HTML file.
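To illustrate the point about reformatting, here is a small sketch comparing a line-oriented regular expression with the standard-library parser (html.parser in Python 3) on markup where the input tag is split across two lines. The sample markup and the names line_based and ValueCollector are made up for this example.

```python
import re
from html.parser import HTMLParser

# Hypothetical reformatted markup: the same element split across two lines.
html = '<body>\n<input name="dummy"\n       value="foo">\n</body>\n'

def line_based(text):
    """Mimic a line-oriented tool: name and value must share one line."""
    pattern = re.compile(r'''name=['"]dummy['"].*value=['"]([^'"]*)['"]''')
    found = []
    for line in text.splitlines():
        match = pattern.search(line)
        if match:
            found.append(match.group(1))
    return found

class ValueCollector(HTMLParser):
    """Collect value attributes of input elements named "dummy"."""
    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs_dict = dict(attrs)
        if tag == 'input' and attrs_dict.get('name') == 'dummy':
            self.values.append(attrs_dict.get('value') or '')

collector = ValueCollector()
collector.feed(html)

print(line_based(html))
print(collector.values)
```

The line-based search finds nothing because no single line contains both attributes, while the parser still reports the value.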
