Migrating Documents to Content in Salesforce.com

Use this guide to start migrating documents and attachments to content in Salesforce.com.

Our fund management software experts compiled their knowledge of AIM and Salesforce into an easy to understand conversion document.

Note: there is a 5,000 file per day limit on Content submissions.

Convert the Org to Use Content

  • Enable Content and add all users as Content users
  • Replace Attachments related lists throughout the system and add related lists for Content (done by adding a lookup to the objects as custom fields on Content-Type layouts)
  • No new attachments can be added during the migration (might have to do this after-hours to limit disruption)

Exports You will Need

Attachments & Document Files: Start by doing a full Export including all files from within SF. This will get you all the attachments, documents, etc. named by their record ID in SF.

  • Setup->Data Management->Data Export->Export Now
  • Be sure to check “Include in Export” at top and “Include all Data” at bottom
  • Note: If an export has been done in the last 48 hours, you will not have the option to “Export Now”. You will have to wait.
  • The Export will complete in 5 minutes to 2 hours depending on its size and SF system availability and an email will be sent to the app-xprod email address. Alternatively, you can periodically refresh the page
  • One or more .zip files should be on the Data Export screen after it has completed
  • Download all of them…they will only be available for 48 hours
  • Place the .zip files somewhere on your local machine
  • Extract all of them within a folder:
  1. Will result in a bunch of .csv files of the data. These can be deleted or filed elsewhere
  2. Will result in Attachments folder with all attachments. You will need these. Make sure they are all in the same folder
  3. Will result in Documents folder with all documents. If migrating these, you will also need these

Attachments Data.csv: Export all current attachments in Data Loader. Exclude the column “Body” from the export.

  • Keep the original Attachments export .csv in a safe place. You’ll probably need to refer to it.
  • Verify the number of Attachments in this export matches the number you have in the Attachments folder from the Export. Ditto if you are also migrating Documents.

ContentVersion.csv: Export the current CONTENT (Content Version) table in Data Loader. Exclude the column “VERSIONDATA” from the Export.

  • Keep original file as you’ll probably need it later.

Build Your Import File

Build your Content Version .csv by first exporting a test record from the Content Version table in data loader. Exclude the “VERSIONDATA” column from the export. Include the following columns or delete the rest after export.

  • ID: Unique ID of Current version of Document that is displayed when you click on a document; Will be blank in the import file
  • CONTENTDOCUMENTID: Unique ID of a Content Document. Different than above; Will be blank in the import file
  • TITLE: This is what the document will be called; Title of the document from the Attachments table or other sources
  • VERSIONDATA: Full file path to the document being inserted without file type extension (i.e. C:ClientsAttachments0PA000002gpI3Mai). See Helpful Hints below; Must be populated and without file extension
  • PATHONCLIENT: Full file path to the document being inserted with file type extension (i.e. C:ClientsAttachments0PA000002gpI3Mai.pdf). This tells Content what the file type is so that the document can be previewed in SF. If extension not assigned correctly, the import is pretty much worthless. See Helpful Hints below; Must be populated and with the file extension
  • OWNERID: In the case of attachments related to other records, this is the owner id (18 char) of the other record, not the attachment (depending on how attachments were loaded, you may be able to simply take the owner of the attachment). This is because some attachments have the owner as the Admin if they were loaded via an API call. You want the actual Owner as any security in the org is driven off of that (Requires additional vLookup to the export of the related records tables); must be populated
  • FIRSTPUBLISHLOCATIONID: This is the workspace ID (18 char) that you are inserting each document into; Must be populated
  • RECORDTYPEID: This is the ContentType ID (18 char) of the Content-type you want to assign to this document; Must be populated if more than one Content-Type defined in the org. Otherwise, it defaults to General and you don’t need the column
  • ALL APPROPRIATE LOOKUPS to other objects (i.e. Contact__C, Account__C); Optional
  • ALL APPROPRIATE FILTERS (i.e. Document_Type__C, Year__C. Needs to be populated manually if you want to filter information; Optional
  • ALL APPROPRIATE LEGACY IDS: I like to add the id of the document from where it came from (may be internal or external IDs); Optional, but highly suggested

Helpful Hints

  • Build your VERSION DATA: Add the static path in one cell in your .csv (i.e. G2 has “C:ClientsAttachments” in it). Add the appropriate attachment ID for this document (i.e. H2 has “00PA000002gpI3Mai” in it). Build the file path by using concatenate function below; G2&H2; Should result in C:ClientsAttachments0PA000002gpI3Mai
  • Determine File Type: Use formula below to extract the characters after the period at the end of the title of the document to grab the file extension. A2 is the cell reference containing the Title of the document that has an extension (i.e. Sample.pdf); =RIGHT(A2,LEN(A2)-FIND(“^^”,SUBSTITUTE(A2,”.”,”^^”,LEN(A2)-LEN(SUBSTITUTE(A2,”.”,””))))); Result should be “pdf”; Eyeball all the results for sort by the results as many times you get “.pd” or “.xl”. Fix the title of these documents so you get a legit file extension
  • Build your PATHONCLIENT: Concatenate the contents of VERSIONDATA that you already built (C:ClientsAttachments0PA000002gpI3Mai) with the result of the file type you have determined (i.e. “.pdf”); Use concatenate function to give you the following result (i.e. =G2&”.”&H2…note the addition of period in quotes); C:ClientsAttachments0PA000002gpI3Mai.pdf
  • Data Loader 21.0 does not report on “File not Found” Errors. It gives you no error or success file record for this situation. Suggest using Data Loader 20.0 for loading of Content Version table
  • CSV’s are finicky. You cannot have more than 1 worksheet in a given .csv. Well, you can, but when you save, it will delete all but the first
  • VLOOKUP: Excellent function. You will need to use it.
  • FIXID: If you happen to have 15 char ids, you can use Excel function =FIXID(A2) where A2 contains the 15 char id

For more tips & tricks for how to manage data in salesforce and AIM, visit the Altvia Community.