<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[William's Research]]></title><description><![CDATA[Thoughts, stories and ideas.]]></description><link>https://williamandrewgriffin.com/</link><image><url>https://williamandrewgriffin.com/favicon.png</url><title>William&apos;s Research</title><link>https://williamandrewgriffin.com/</link></image><generator>Ghost 3.14</generator><lastBuildDate>Thu, 12 Mar 2026 19:10:04 GMT</lastBuildDate><atom:link href="https://williamandrewgriffin.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[How to Download from Azure Blob Storage with Streams (using Express)]]></title><description><![CDATA[While working on our React Native app we needed a simple way to upload and download images. Azure Blob Storage was a natural fit for our image store.]]></description><link>https://williamandrewgriffin.com/how-to-download-from-azure-blob-storage-with-streams-using-express/</link><guid isPermaLink="false">5ec052972ec217261b7e810f</guid><category><![CDATA[Software Development]]></category><category><![CDATA[Azure]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[App Age Technologies]]></category><category><![CDATA[Node.js]]></category><category><![CDATA[React Native]]></category><dc:creator><![CDATA[William Andrew Griffin]]></dc:creator><pubDate>Tue, 16 Jun 2020 13:51:40 GMT</pubDate><media:content url="https://williamandrewgriffin.com/content/images/2020/06/ryan-lara-CI2WQOU1Isc-unsplash.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://williamandrewgriffin.com/content/images/2020/06/ryan-lara-CI2WQOU1Isc-unsplash.jpg" alt="How to Download from Azure Blob Storage with Streams (using Express)"><p>While working on our React Native app–<a 
href="https://mender.app/">Mender</a>–we needed a simple way to upload and download images. Our backend is hosted on Azure using Node.js and Express, so Azure Blob Storage was a natural fit for our image store.</p><p>A lot of great articles exist explaining how to upload to Azure Blob Storage. Unfortunately, few do a good job explaining the details necessary for downloads. Even <a href="https://docs.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-nodejs">Azure's documentation</a> leaves a lot to be desired. The majority of the articles provide steps to download <em>blobs</em> directly to the filesystem. While this works in some use cases, we needed something that could easily transport images without having to write them to disk or load an entire image into memory before sending it to our mobile app.</p><h2 id="that-is-where-streams-come-in-">That is where streams come in...</h2><p>Streams allow us to read and write data in chunks, passing them to another process or system as we receive or send them. This makes our application memory efficient (only handling a small portion of the data at a time) and speed efficient (since there is no disk I/O or waiting for the data to be fully collected in memory).</p><p>By passing the data to another process, or <strong>piping</strong> it, we are able to keep the benefits of streams further down in our application process. In our case, Mender is able to read chunks of image data from our backend, which itself is reading data chunks from Azure.</p><!--kg-card-begin: markdown--><p><em>For a more detailed introduction to streams, check out Liz Parody's article, <a href="https://nodesource.com/blog/understanding-streams-in-nodejs/">Understanding Streams in Node.js</a>.</em></p>
<!--kg-card-end: markdown--><h2 id="implementing-the-download-process">Implementing the Download Process</h2><p>Over the next few sections, we will take you through the components we use on our Express server and React Native application to download images. The application process flow is:</p><ul><li>React Native app sends an HTTP <code>get</code> request to an endpoint on our backend server (requesting the specific image file)</li><li>Server uses the request parameters to start the image file download from Azure</li><li>Server pipes the download back to the HTTP response of the original <code>get</code></li><li>React Native app processes the image file</li></ul><p>Let's start with the code we use on our Express server and then move to the client side, our React Native application.</p><h3 id="connecting-to-azure">Connecting to Azure</h3><p>To interface with Azure, we use their <a href="https://www.npmjs.com/package/@azure/storage-blob">@azure/storage-blob</a> client. We created an object with properties that handle the interactions we want to perform: delete, download, upload. These properties store asynchronous functions that manage details needed by the storage-blob client. Our object is used on the Express server, so we are able to securely load our Azure connection string as an environment variable.</p><p>The <code>download</code> function takes in the container and filename for the specific image. It returns a Promise that resolves to a data object that contains a Readable stream: <em>readableStreamBody</em>.</p><!--kg-card-begin: markdown--><pre><code class="language-js">const { BlobServiceClient } = require('@azure/storage-blob');

const AZURE_STORAGE_CONNECTION_STRING = process.env.AZURE_STORAGE_CONNECTION_STRING;

const azureBlob = {
    delete: async (containerName, fileName) =&gt; {
    },
    download: async (containerName, fileName) =&gt; {
        // Create the BlobServiceClient object which will be used to create a container client
        const blobServiceClient = BlobServiceClient.fromConnectionString(AZURE_STORAGE_CONNECTION_STRING);
        // Get a reference to a container
        const containerClient = blobServiceClient.getContainerClient(containerName);
        // Get a block blob client
        const blockBlobClient = containerClient.getBlockBlobClient(fileName);
        return blockBlobClient.download(0);
    },
    upload: async (containerName, fileName) =&gt; {
    },
}

module.exports = {
    azureBlob
};
</code></pre>
<!--kg-card-end: markdown--><h3 id="triggering-the-download-function">Triggering the Download Function</h3><p>To trigger the download, we use <a href="http://expressjs.com/en/5x/api.html#router">Express Router</a>'s <code>get</code> method. Our React Native application sends a <code>get</code> request to our <code>/api/avatar/:container/:imgUrl</code> endpoint on our backend server. We use the route parameters <em>container</em> and <em>imgUrl</em> of the request URL to specify the blob container and filename. These string variables are fed into the <code>azureBlob.download(container, fileName)</code> function we presented in the previous section.</p><!--kg-card-begin: markdown--><pre><code class="language-js">const router = require('express').Router();
const { azureBlob } = require('../azure');

// Get route for avatars
router.get('/avatar/:container/:imgUrl', (req, res, next) =&gt; {
    // Start blob download from Azure
    azureBlob.download(req.params.container, req.params.imgUrl)
        // Pipe download stream to response
        .then(downloadBlockBlobResponse =&gt; 
            downloadBlockBlobResponse.readableStreamBody.pipe(res))
        .catch(error =&gt; {
            next(error);
        })
})

module.exports = router;</code></pre>
<!--kg-card-end: markdown--><h2 id="the-trick">The Trick</h2><!--kg-card-begin: markdown--><p>This code is how we are able to benefit from streams:</p>
<pre><code class="language-js">azureBlob.download(req.params.container, req.params.imgUrl)
        .then(downloadBlockBlobResponse =&gt;
            downloadBlockBlobResponse.readableStreamBody.pipe(res))
</code></pre>
<!--kg-card-end: markdown--><p>When we receive the <code>blockBlobClient.download(0)</code> object from our <code>azureBlob.download</code> function, we are receiving a <em>Promise</em> that resolves to an object containing a readable stream of data. Using <strong>.pipe(destination)</strong>, we are then able to send these incoming chunks of data from Azure to our React Native application using the HTTP response object, <em>res</em>.</p><!--kg-card-begin: markdown--><p><img src="https://williamandrewgriffin.com/content/images/2020/06/day49-angler-fish.png#left" alt="How to Download from Azure Blob Storage with Streams (using Express)"><br>
– <a href="https://nodejs.org/api/stream.html#stream_readable_pipe_destination_options">Node.js official documentation</a> –<br>
The <em>readable.pipe()</em> method attaches a <em>Writable</em> stream to the readable, causing it to switch automatically into flowing mode and push all of its data to the attached <em>Writable</em>. The flow of data will be automatically managed so that the destination <em>Writable</em> stream is not overwhelmed by a faster <em>Readable</em> stream.</p>
<!--kg-card-end: markdown--><h3 id="starting-the-download-from-react-native">Starting the Download from React Native</h3><p>When the app needs to update a user image, for example when a user changes their avatar, it starts an asynchronous download. Using Expo for our React Native development, we are able to use their <a href="https://docs.expo.io/versions/latest/sdk/filesystem/">FileSystem</a> component to handle the data stream coming from our Express server. (A side note on Expo: it is a powerful development and build tool that vastly streamlines these processes across Android and iOS.) The application starts the download with two variables: the remote URI to download from and the local URI of the file to download to.</p><ol><li><strong>remote URI example: </strong>"https://expressserver.company.com/api/img/avatar/avatars/photo-5ea10a1f3a7b30e38143ec00.jpeg"</li><li><strong>local URI example:</strong> "file://avatars/photo-5ea10a1f3a7b30e38143ec00.jpeg"</li></ol><p>During the image upload process, the application selects a random, unique filename and the respective container associated with the image as it appears on Azure Blob Storage. This URI string is stored in MongoDB as the <em>imgUri</em>. To download an image from Azure, the application first queries MongoDB for the <em>imgUri</em>.</p><ul><li><strong>imgUri example:</strong> "avatars/photo-5ea10a1f3a7b30e38143ec00.jpeg"</li></ul><p>The following React Native function is used to start the download process. It takes two variables: the <em>imgUri</em> to download and the API endpoint on our Express server to use.</p><!--kg-card-begin: markdown--><pre><code class="language-js">import Constants from 'expo-constants';
import * as FileSystem from 'expo-file-system';

// Express server URL
const SERVER_URL = Constants.manifest.extra.SERVER_URL;

// Image download function
const downloadImg = async (imgUri, imgEndpoint) =&gt; {
    return FileSystem.downloadAsync(
      SERVER_URL + imgEndpoint + imgUri,
      FileSystem.documentDirectory + imgUri
    )
    .then(({ uri }) =&gt; {
      console.log('Finished downloading to ', uri);
      return uri
    })
    .catch(error =&gt; console.log(error))
};</code></pre>
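To make the URI handling concrete, here is how the remote and local URIs from the examples above are assembled. The constants below are illustrative stand-ins — in the app, <code>SERVER_URL</code> comes from Expo's <code>Constants</code> and the document directory from <code>FileSystem</code>:

```javascript
// Stand-ins for the Expo-provided values (illustrative, not real endpoints)
const SERVER_URL = 'https://expressserver.company.com';
const documentDirectory = 'file://';

// Values matching the examples in the article
const imgEndpoint = '/api/img/avatar/';
const imgUri = 'avatars/photo-5ea10a1f3a7b30e38143ec00.jpeg';

// downloadAsync(remoteUri, localUri) receives these two strings
const remoteUri = SERVER_URL + imgEndpoint + imgUri;
const localUri = documentDirectory + imgUri;

console.log(remoteUri); // https://expressserver.company.com/api/img/avatar/avatars/photo-5ea10a1f3a7b30e38143ec00.jpeg
console.log(localUri);  // file://avatars/photo-5ea10a1f3a7b30e38143ec00.jpeg
```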
<!--kg-card-end: markdown--><p>The FileSystem component handles the collection of the data chunks coming from our Express server stream, saving them to a specified location. By combining the imgUri with <code>FileSystem.documentDirectory</code>, we are telling the <code>downloadAsync</code> function to use the container as the folder for the image. In the example above, the application would download to the "file://avatars" folder with "photo-5ea10a1f3a7b30e38143ec00.jpeg" as the filename.</p><p>And that is it. We have used streams in Express to pipe data flowing from Azure Blob Storage to a local file on our mobile device. A lot more can be done with streams, so I highly recommend spending time learning more about them. If you have any questions or need some advice on your Node.js project, please feel free to reach out to us at <a href="mailto:info@appagetech.com">info@appagetech.com</a>.</p>]]></content:encoded></item><item><title><![CDATA[Best Way to Deploy spaCy to Azure Functions]]></title><description><![CDATA[Last year we partnered with a FinTech firm that wanted to implement a new natural language processing (NLP) model and transition to a new cloud provider.]]></description><link>https://williamandrewgriffin.com/best-way-to-deploy-spacy-to-azure-functions/</link><guid isPermaLink="false">5eb4d32d8c2dd3506858e111</guid><category><![CDATA[Software Development]]></category><category><![CDATA[Azure]]></category><category><![CDATA[Python]]></category><category><![CDATA[App Age Technologies]]></category><dc:creator><![CDATA[William Andrew Griffin]]></dc:creator><pubDate>Mon, 11 May 2020 20:01:11 GMT</pubDate><media:content url="https://williamandrewgriffin.com/content/images/2020/05/greg-rakozy-oMpAz-DN-9I-unsplash.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://williamandrewgriffin.com/content/images/2020/05/greg-rakozy-oMpAz-DN-9I-unsplash.jpg" alt="Best Way to Deploy spaCy to Azure Functions"><p>Last year we partnered with a FinTech firm that wanted to implement a new
natural language processing (NLP) model and transition to a new cloud provider. To accomplish this, we decided to use <a href="https://docs.microsoft.com/en-us/azure/azure-functions/functions-reference-python">Azure Functions</a> for cloud hosting and <a href="https://spacy.io/">spaCy</a> as the foundation for the NLP model. Our initial builds only ran into minor configuration issues. However, as soon as our application began to grow, we hit major issues: unstable builds, crashing CI/CD pipelines, and failed deployments. With these issues hindering our progress, it was time to take the system apart to determine the cause of the problems. Everything was working perfectly—except spaCy. This was a major problem, as spaCy was key to this project: it delivered the best results versus competing machine learning tools and was calibrated to the client’s use case. </p><p>After many frustrating hours, and what proved to be numerous trials and, ultimately, errors, we discovered the best way to deploy a large spaCy model to Azure Functions: manually include the model data directory as part of the application’s repository.</p><h2 id="tl-dr">TL;DR</h2><ul><li><strong>Download and extract: </strong> <code>en_core_web_lg-2.2.5.tar.gz</code> </li><li><strong>Copy model data directory to your app folder: </strong> <code>__app__</code></li><li><strong>Import spaCy into your function file: </strong> <code>import spacy</code></li><li><strong>Load the model using: </strong> <code>nlp = spacy.load('&lt;filepath&gt;/en_core_web_lg-2.2.5')</code></li></ul><h3 id="background">Background</h3><p><strong>Why did we choose spaCy?</strong> spaCy is a production-ready, incredibly fast and accurate NLP tool, especially when using its large general-purpose models. We were able to conduct accurate part-of-speech tagging and noun chunking out of the box. From there, we developed a proprietary processing algorithm to consume spaCy’s Dependency Parser.
This was tailored to the specific type of text documents that the client was analyzing. All in all, we ended up with a very clean and representative collection of data.</p><p><strong>Why did we choose Azure Functions?</strong> Since spaCy is just one part of our Azure Function that handles data processing, we needed a system without cold-start lag that had the ability to scale dynamically. The capabilities of Azure Functions and a dedicated App Service plan worked well with what we were trying to accomplish versus AWS Lambda. A dedicated App Service plan allowed us to run the Function application like a traditional web app. The plan uses dedicated scalable memory and compute. So, we were able to keep the application in memory, including the large spaCy model, allowing hot starts. By hosting on Azure Functions, our app is cached in memory, has unbounded timeout duration, flexible memory usage (up to 14 GB), and flexible compute.</p><h3 id="different-methods-we-tried">Different Methods We Tried</h3><h4 id="using-the-recommended-download-method">Using the recommended download method</h4><p>We tried to incorporate the model download as one of the build steps. The benefit of this would have been a smaller repo and access to the most recent model on each update. However, due to the size of the download, our builds began to randomly fail. At the same time, our build time was exploding, making it impossible to iterate quickly. Builds were averaging 20+ minutes.</p><pre><code class="language-bash">$ python -m spacy download en_core_web_sm

&gt;&gt;&gt; import spacy
&gt;&gt;&gt; nlp = spacy.load("en_core_web_sm")</code></pre><h4 id="installation-via-pip">Installation via pip</h4><p>Similar to the previous method, this would give us a smaller repo, but we would have lost the ability to dynamically use the most recent model. Nonetheless, we needed a stable build and deployment pipeline. So, we incorporated the model's external URL as a part of our <code>requirements.txt</code>. Our builds now succeeded; however, deployments started to randomly fail. Builds saw a minor improvement, averaging under 10 minutes, yet deployments started taking 15 to 20 minutes, when they worked at all.</p><pre><code class="language-shell"># With external URL
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.2.5/en_core_web_lg-2.2.5.tar.gz
</code></pre><h3 id="our-final-implementation-wooooo-it-is-working">Our Final Implementation! Wooooo, it is working</h3><p>Our last-ditch effort was to manually download and extract the language model as a part of our application's repo. This meant drastically increasing the repo size. It also meant that we would have to manually replace the model when a new one became available. With nothing else working, we considered our options and decided these downsides were manageable for us.</p><p>We used the most recent large English model available at the time, <code>en_core_web_lg-2.2.5</code>. After downloading and extracting the archive, we saved the model directory in our app folder.</p><!--kg-card-begin: markdown--><p><img src="https://williamandrewgriffin.com/content/images/2020/05/SpaCy-Model-in-__app__-Folder.png#right" alt="Best Way to Deploy spaCy to Azure Functions"></p>
<!--kg-card-end: markdown--><p>To use the model in our function, we loaded it directly. For us this was in the <em>parent directory</em> of our current utility function. Below is our code.</p><pre><code class="language-python">import pathlib
import spacy

def get_spacy_path():
    current_path = pathlib.Path(__file__).parent.parent
    return str(current_path / 'en_core_web_lg-2.2.5')
print(get_spacy_path())
nlp = spacy.load(get_spacy_path())</code></pre><h4 id="infrastructure">Infrastructure</h4><ul><li><strong>App Service:</strong> B2 (3.5 GB memory; 1:1 vCPU:Core)</li><li><strong>Git repo:</strong> Azure Repos</li><li><strong>Azure Function: </strong>Python Function App version 2.0</li><li><strong>Python: </strong>version 3.7</li><li><strong>CI/CD: </strong>Azure DevOps Pipelines (YAML deployment)</li></ul><h2 id="closing-thoughts">Closing Thoughts</h2><p>Our approach was not ideal. Our expectations for how these technologies could and should be combined were far from reality. We could have saved ourselves a lot of pain if we had tested various deployment methods upfront. Our first priority was testing and building the application—DevOps was secondary. As we approach new projects, our testing framework will include production deployment, treating it as equally important as the application code itself from day one.</p><hr><h2 id="final-deployment-yaml">Final Deployment YAML</h2><pre><code class="language-YAML"># Production ----- Python Function App to Linux on Azure
# Build a Python function app and deploy it to Azure as a Linux function app.
# Add steps that analyze code, save build artifacts, deploy, and more:
# https://docs.microsoft.com/azure/devops/pipelines/languages/python

trigger:
- master

variables:
  # Azure Resource Manager connection created during pipeline creation
  azureSubscription: &lt;subscription&gt;

  # Function app name
  functionAppName: &lt;appName&gt;

  # Agent VM image name
  vmImageName: 'ubuntu-latest'

  # Working Directory
  workingDirectory: '$(System.DefaultWorkingDirectory)/__app__'

stages:
- stage: Build
  displayName: Build stage

  jobs:
  - job: Build
    displayName: Build
    pool:
      vmImage: $(vmImageName)

    steps:
    - bash: |
        if [ -f extensions.csproj ]
        then
            dotnet build extensions.csproj --runtime ubuntu.16.04-x64 --output ./bin
        fi
      workingDirectory: $(workingDirectory)
      displayName: 'Build extensions'

    - task: UsePythonVersion@0
      displayName: 'Use Python 3.7'
      inputs:
        versionSpec: 3.7

    - bash: |
        pip install --upgrade pip
        pip install -t .python_packages/lib/site-packages -r requirements.txt
      workingDirectory: $(workingDirectory)
      displayName: 'Install application dependencies'

    - task: ArchiveFiles@2
      displayName: 'Archive files'
      inputs:
        rootFolderOrFile: '$(workingDirectory)'
        includeRootFolder: false
        archiveType: zip
        archiveFile: $(Build.ArtifactStagingDirectory)/$(Build.BuildId).zip
        replaceExistingArchive: true

    - publish: $(Build.ArtifactStagingDirectory)/$(Build.BuildId).zip
      artifact: drop

- stage: Deploy
  displayName: Deploy stage
  dependsOn: Build
  condition: succeeded()

  jobs:
  - deployment: Deploy
    displayName: Deploy
    environment: 'production'
    pool:
      vmImage: $(vmImageName)

    strategy:
      runOnce:
        deploy:

          steps:
          - task: AzureFunctionApp@1
            displayName: 'Azure functions app deploy'
            inputs:
              azureSubscription: '$(azureSubscription)'
              appType: functionAppLinux
              appName: $(functionAppName)
              package: '$(Pipeline.Workspace)/drop/$(Build.BuildId).zip'</code></pre>]]></content:encoded></item></channel></rss>