The problem is that you're calling starts_with? on an object that doesn't implement it.
item.css("a").each do |a|
will yield a Nokogiri XML node in a, not a string. What you want is to check only the part of the node you care about as text. Because that part is an attribute of the node, it can be accessed like this:
a['href']
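So, you want to use something like this (String#start_with? is core Ruby; starts_with? is the ActiveSupport alias for it):
item.css("a").each do |a|
  # a['href'] is a String, so the prefix check works on it
  unless a['href'].start_with?('http://')
    a.replace(a.content)
  end
end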
The downside to this is you have to walk through every <a> tag in the document, which can be slow on a big page with lots of links.
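One more thing to watch for in the Ruby loop: a['href'] returns nil when an <a> tag has no href attribute, and nil doesn't respond to start_with?. Here is a minimal sketch of a guard, assuming you want such anchors treated the same as links that don't start with http://. The helper name absolute_http? is hypothetical, not part of Nokogiri, and plain strings stand in for the attribute values so the snippet runs without a document:

```ruby
# Hypothetical helper (not part of Nokogiri): true when href begins
# with "http://". `href` may be nil, which is what a['href'] returns
# when the attribute is missing; nil.to_s is "", so the check is safe.
def absolute_http?(href)
  href.to_s.start_with?('http://')
end

# Plain values standing in for what a['href'] would return.
hrefs = ['http://bar', 'doesnt_start_with', nil]
kept = hrefs.select { |h| absolute_http?(h) }
```

Inside the loop, the condition would then read unless absolute_http?(a['href']).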
An alternative is to use XPath's starts-with function. Starting from this sample document:
require 'nokogiri'
item = Nokogiri::HTML('<a href="doesnt_start_with">foo</a><a href="http://bar">bar</a>')
puts item.to_html
which outputs:
>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
>> <html><body>
>> <a href="doesnt_start_with">foo</a><a href="http://bar">bar</a>
>> </body></html>
Here's how to do it using XPath:
item.search('//a[not(starts-with(@href, "http://"))]').each do |a|
  a.replace(a.content)
end
puts item.to_html
Which outputs:
>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
>> <html><body>foo<a href="http://bar">bar</a>
>> </body></html>
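The advantage to using XPath to find the nodes is that the matching all runs in compiled C (libxml2, which Nokogiri wraps) rather than in Ruby, so the Ruby block only runs for the links that actually need replacing.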