After Tuesday's announcement that SWF files are now getting fully indexed, I thought I'd do a little experiment and put up a few test SWF files. It's difficult to deduce exactly what is happening, but I thought I'd write down what I've tried and what the results are so far.
 

What did I use?

I created a Flash 9 SWF exported from Flash CS3, added some component instances in a variety of ways, added an input text field and set up a function that triggers a PHP script on my server and subsequently sends me an email with the value of the input text field.

Embed methods

I embedded the SWF in three ways: with object/embed tags, with SWFObject, and with the standard publish output from Flash CS3 (i.e. AC_FL_RunContent).

Result: all three SWF files got hits from Google's search bot and triggered an email to be sent. No arguments were sent to the script, either through POST or GET.

What is getting indexed?

I set up four Button component instances: one added manually on the Stage, one added programmatically to the DisplayList, one instantiated but never added to the DisplayList, and one added to the DisplayList outside the Stage bounds so it is not visible to the user.

Result: of these four, only two had a MouseEvent.CLICK triggered: the one added manually and the one added programmatically within the visible bounds of the Stage.

Trace statements

Added some trace statements throughout the code to see if those would get picked up.

Result: trace statements do not appear to be getting indexed.
 

Preliminary conclusion

My test SWF had little static text and no dynamically loaded text to be indexed. I'm working on an updated version of the test SWF to put up, so I can look into what exactly gets indexed and how.

This morning I got a comment on my previous blog post from my brother Kristof, saying he noticed Google was now indexing URLs to photographs and music files from his band that he references from his Flash content. From what I can see, what Google has done there is follow a reference to an XML file and index that file, which contains the URLs.

This is what Google says: "We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource.."

http://googlewebmastercentral.blogspot.com/2008/06/improved-flash-indexing.html
 
Why? Adobe, please tell me why this is a good thing and how this would help SEO of Flash content. It makes no sense whatsoever to index calls to .xml files and server-side scripts referenced from an SWF and link to those URLs.

Just to make this clear: if you do a filetype:swf search in Google, no dynamically loaded data will show up. What happens instead is that the URLs you use in your SWF get crawled separately. You'll increasingly start seeing the .xml, .php, etc. files used in your SWF show up in the search results, and those results link not to the SWF file that uses them but to the .xml, .php, etc. file itself.

In short:

- Google follows URLRequest links and indexes XML and other files referenced in your SWF that return text (though not in the context of the SWF, i.e. it links to those URLs directly rather than to the SWF that uses them)
- Only instances added to the DisplayList and visible on Stage are getting triggered
- When using URLVariables, no values seem to get sent along with the URLRequest
 

Remaining issues

These are two things I've seen happen that could be troublesome:

- URLs to files loaded in from SWF content are getting exposed in search results (and not in reference to the SWF that uses them)
- Server-side scripts referenced in the SWF are getting hits from search bots, potentially causing unwanted behavior
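
For the second issue, one possible stopgap follows from the test results above: the bot hits arrived without any POST or GET arguments, so a script can refuse to act unless the expected value is actually present. Below is a minimal sketch of that check, written in Python as a stand-in for the PHP script. The field name `message` and the function name are made up for illustration, and this assumes crawlers keep sending empty requests, which may not hold.

```python
from typing import Mapping

# Hypothetical name of the field the SWF sends via URLVariables.
REQUIRED_FIELD = "message"


def should_send_email(method: str, params: Mapping[str, str]) -> bool:
    """Return True only if the request looks like a real submission.

    Assumption: a genuine submission is a POST carrying a non-empty
    value for REQUIRED_FIELD, while crawler hits carry no arguments.
    """
    if method.upper() != "POST":
        return False
    value = params.get(REQUIRED_FIELD, "").strip()
    return bool(value)


# A crawler-style hit with no arguments is ignored.
print(should_send_email("GET", {}))                      # False
# A POST with the expected field goes through.
print(should_send_email("POST", {"message": "hello"}))   # True
```

This only avoids the unwanted side effect (the email); it does nothing to keep the script's URL out of the search results.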
 
I really want to see Adobe, Google and Yahoo! urgently come out with additional information for developers on how to prevent unwanted files from getting indexed, how the indexing works for the various search engines, and how each of them handles things like following URLRequests.
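
Until that information is available, the general-purpose tool for keeping crawlers away from specific URLs is robots.txt. I have not verified how the Flash-aware crawling treats it, so take this as a sketch under assumptions rather than a confirmed fix. The example uses Python's standard `urllib.robotparser` to check which URLs a given rule set would block for a compliant bot; the `/data/` and `/scripts/` paths and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep crawlers out of the folders holding the
# XML data and server-side scripts that the SWF loads.
ROBOTS_TXT = """\
User-agent: *
Disallow: /data/
Disallow: /scripts/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether a robots.txt-compliant bot may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The SWF itself stays crawlable; the files it references do not.
print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/flash/site.swf"))
print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/data/photos.xml"))
print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/scripts/mail.php"))
```

Note that robots.txt only asks well-behaved bots not to fetch a URL. Whether that is enough to keep the URL itself out of the results is exactly the kind of detail I'd like the search engines to document.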
 

Posted by Peter
Categories: General