I have two files; the first looks like this,
        SomeFile.CEL    SomeOtherFile.CEL
probe1  111     666
probe2  222     777
probe3  333     888
probe4  444     999
probe5  555     100
probe6  101     102
and the second looks like this (note: the duplicate probe4, has two different gene names),
probe1  Gene1
probe2  Gene2
probe3  Gene3
probe4  Gene4A
probe4  Gene4B
probe5  Gene5
probe7  Gene6
What I need is an output file to look like this,
Gene1 111 666
Gene2 222 777
Gene3 333 888
Gene4A 444 999
Gene4B 444 999
Gene5 555 100
This ideal output file would contain all of the gene names which had matched probe names between the two files. Additionally, where multiple names exist for a single probe, I want the expression data (444 999) to be duplicated for all possible gene names (this example shows 2 gene names for a single probe, but it could be as many as 5 or 6.) By the way all files are tab-separated.
I've searched through this and other forums and while these article came close,
Replace multiple arguments with gsub
awk print column $3 if $2==a specific value?
awk partly string match (if column partly matches)
Sed pattern to match and replace
they don't answer my full question.
So far, I have had the most success with this command,
awk -F"\t" 'FILENAME=="input1.file"{a[$1]=$1} FILENAME=="input2.file {if(a[$1]){$1="";print $0}}'   input1.file input2.file
but it doesn't account for the necessary duplication. Lastly, there are some files which look like input1, but contain more than just the two samples I described (someFile.CEL and someOtherFile.CEL). There could be as many as 50 samples/CEL files. I figure I might have to build a script, but I thought I'd ask if there was a simpler way first.