🚩 Introduction
In a recent project, we needed to transcribe documents into HTML format within Sitefinity, making them editable directly via the Content Block widget.
Sitefinity uses Content Blocks to manage rich text and HTML content. To extend its functionality and implement a custom document transcription feature, we needed to create a custom extension for the Sitefinity AdminApp.
💡 Planning the Approach
After reviewing the available resources, we found a relevant AdminApp Extension Sample Repository.
The sample that closely matched our goal was the Sitefinity Insert Videos Extension.
While researching, we also discovered there are paid products available for document transcription. However, for flexibility and full control, we decided to build our own solution using Aspose.Words for .NET.
Our plan was:
- Load the uploaded document in Sitefinity
- Transcribe it into HTML format using Aspose.Words
- Embed the resulting HTML into the Content Block editor
⚡ Execution Strategy
Since the AdminApp is an Angular application, we needed an API endpoint that:
- Accepts a document ID
- Loads the document from Sitefinity
- Converts it to HTML
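On the AdminApp side, calling such an endpoint could look like the sketch below. The route `/api/transcribe`, the `documentId` query parameter, and the helper names are assumptions made for illustration; the original does not name the endpoint. The HTTP call is passed in as a function so that the sketch does not depend on a particular client.

```typescript
// Hypothetical route; the original post does not name the endpoint.
const TRANSCRIBE_ROUTE = "/api/transcribe";

// Minimal shape of the HTTP response we rely on (a subset of the Fetch API's Response).
interface HttpResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}

// Any function that performs a GET request and resolves with a response.
type HttpGet = (url: string) => Promise<HttpResponseLike>;

// Builds the request URL, encoding the ID so it is safe inside a query string.
function buildTranscribeUrl(baseUrl: string, documentId: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmed}${TRANSCRIBE_ROUTE}?documentId=${encodeURIComponent(documentId)}`;
}

// Requests the transcription and resolves with the returned HTML string.
async function transcribeDocument(
  httpGet: HttpGet,
  baseUrl: string,
  documentId: string
): Promise<string> {
  const response = await httpGet(buildTranscribeUrl(baseUrl, documentId));
  if (!response.ok) {
    throw new Error(`Transcription failed with HTTP status ${response.status}`);
  }
  return response.text();
}
```

In a browser, `(url) => fetch(url)` satisfies `HttpGet`; inside the Angular AdminApp, a thin wrapper around the app's own HTTP client would serve the same purpose.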
For the server-side API:
- We loaded the document as a stream
- We converted it to HTML, configuring the export to embed images as base64 strings so they display inline
On the AdminApp side:
- We cloned the Insert Videos extension for structure
- However, we discovered that there was no built-in method for opening a Document Library picker, unlike the image or video selectors
🔍 Workaround for Document Selection
We used the built-in insertDocument functionality to allow document selection.
From our earlier research (reference article), we reused this method to access the Kendo Editor:
const editor = editorHost.getKendoEditor();
const editorValue = editor.value();
We applied a simple regular expression to extract the selected document’s ID from the inserted link.
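The exact pattern is not shown here, so the following is only a sketch of how such an extraction might work. It assumes the document ID is a GUID that appears somewhere in the attributes of the inserted `<a>` tag; the function name and the sample markup in the usage note are hypothetical.

```typescript
// Matches every opening <a ...> tag in the editor HTML.
const ANCHOR_TAG = /<a\b[^>]*>/gi;

// Matches a GUID such as 3f2504e0-4f89-11d3-9a0c-0305e82c3301 (assumed ID format).
const GUID = /[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}/i;

// Returns the GUID found in the first link that carries one, or null if none does.
function extractDocumentId(editorHtml: string): string | null {
  const anchors = editorHtml.match(ANCHOR_TAG) || [];
  for (const tag of anchors) {
    const guid = tag.match(GUID);
    if (guid) {
      return guid[0].toLowerCase();
    }
  }
  return null;
}
```

For example, `extractDocumentId(editor.value())` would return the ID of the first document link in the editor. If the content can hold several document links, deciding which one was just inserted needs extra logic that this sketch leaves out.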
The API then:
- Retrieved the document using Sitefinity’s LibrariesManager API
- Used Aspose.Words to transcribe the document into HTML
- Returned the resulting HTML to the AdminApp
Finally, we updated the Content Block editor:
editor.value(html);
editor.trigger('change');
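The two calls above can be wrapped in a small helper. This is a sketch rather than the code from the project: `EditorLike` and `applyTranscribedHtml` are hypothetical names, and the guard against an empty response is an added assumption, not something the original flow describes.

```typescript
// The two Kendo Editor members used above; the real widget exposes many more.
interface EditorLike {
  value(html: string): void;
  trigger(eventName: string): void;
}

// Replaces the editor content with the transcribed HTML and raises "change".
// Returns false, leaving the editor untouched, when the API returned nothing useful.
function applyTranscribedHtml(editor: EditorLike, html: string): boolean {
  if (html.trim() === "") {
    return false; // assumption: keep the existing content rather than blanking it
  }
  editor.value(html);
  editor.trigger("change");
  return true;
}
```

Typing the helper against a narrow interface keeps it easy to exercise with a stub editor, without loading the AdminApp.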
This process enabled seamless insertion of transcribed, editable HTML content into the page!
📱 Final Result
Here’s how it looks once transcribed into editable HTML:
🚀 New Challenge: Base64 Images Not Displaying
After transcription, images encoded in base64 were not displaying correctly in the Content Block.
We suspected Sitefinity’s HTML sanitizer was stripping out the data URLs.
Further investigation confirmed this: the problem lay in the sanitizer’s list of allowed URL schemes, which did not include data.
💡 Solution: Extending the HTML Sanitizer
We created a custom sanitizer that allows the data: scheme:
public class SitefinityExtendedHtmlSanitizer : HtmlSanitizer
{
    public SitefinityExtendedHtmlSanitizer()
    {
        // Allow data: URLs so base64-encoded images survive sanitization
        base.AllowedSchemes.Add("data");
    }
}
This resolved the issue, allowing base64 images to render correctly in the Content Block.
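To see why the scheme list matters, here is a toy model of scheme-based filtering. It is not Sitefinity’s sanitizer, only an illustration of the idea: the function names are hypothetical, and the `http`/`https` list in the usage note stands in for a default configuration we are assuming.

```typescript
// Returns the URL scheme in lower case, or null for relative URLs.
function getScheme(url: string): string | null {
  const match = /^\s*([a-z][a-z0-9+.-]*):/i.exec(url);
  return match ? match[1].toLowerCase() : null;
}

// Drops src attributes whose scheme is not in the allow-list (toy sanitizer model).
function stripDisallowedSrc(html: string, allowedSchemes: string[]): string {
  return html.replace(/\ssrc="([^"]*)"/gi, (attribute: string, url: string) => {
    const scheme = getScheme(url);
    const allowed = scheme === null || allowedSchemes.indexOf(scheme) !== -1;
    return allowed ? attribute : ""; // remove the whole attribute when disallowed
  });
}
```

With `["http", "https"]` as the allow-list, an `<img>` whose `src` starts with `data:image/png;base64,` loses its `src` and renders as a broken image. Adding `"data"` to the list leaves the markup untouched, which mirrors the effect of the C# change above.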
🎯 Challenge Resolved!
Here’s the working screen after base64 images are properly rendered:
📚 References
- Sitefinity – Persist base64 images
- Sitefinity – Modify the HTML sanitizer configuration
- Aspose – Words for .NET
📦 Code Repository
This tutorial is taken from the hawjeh blog.