[et_pb_section bb_built="1" _builder_version="3.17.6" custom_padding="0px||0px"][et_pb_row][et_pb_column type="4_4"][et_pb_text admin_label="The Challenge" _builder_version="3.17.6"]

The Challenge

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row custom_padding="7px||7px" custom_margin="0px||" _builder_version="3.17.6" module_class_1="ds-vertical-align" module_class_2="ds-vertical-align"][et_pb_column type="4_4"][et_pb_text _builder_version="3.17.6"]

[text-blocks id="requirements"]A client recently approached us with a data science challenge regarding one of their data sets. The data was provided to the client in an AWS environment, in a Redshift data warehouse. While this was fast, they found it very expensive, because in that setup the data storage and compute costs are coupled together. As such, a large data set necessitates a high spend on compute, even if that level of speed is not necessary for the client's analysts.

However, the data was also available in CSV format in an S3 storage bucket, which could be the starting point of a new approach. The client already had all their infrastructure deployed and managed by Hentsū in Azure, so they wanted to consolidate into the existing infrastructure. 

After reviewing the challenges, we were able to create an elegant solution leveraging the huge power and scale of the cloud, which is simply not possible in traditional infrastructure.

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row][et_pb_column type="4_4"][et_pb_text admin_label="Key Considerations" _builder_version="3.17.6"]

Key Considerations

  • The solution had to be able to process this large data set consisting of over 11,000 files and a total compressed size of ~2TB, with additional files every day.
  • Raw files had to be stored for any future needs, whilst also being ingested into a database.
  • The ingestion should both be parallelisable and rate controlled, to ensure we manage the number of database connections and have orderly ingestion.
  • Not only was this to be a one-time load of historical data, but new files created needed downloading and ingesting in an automated fashion.
  • Every file had to be accounted for to ensure that all the data was moved correctly, so keeping track of each file's status was important. Things happen: connections break and processes stop working, so we needed a system in place for when they do.
  • Keep ongoing maintenance low effort, cost-efficient and automated, and delegate as much of the maintenance as possible away from end users.
[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row][et_pb_column type="4_4"][et_pb_text admin_label="The Solution" _builder_version="3.17.6"]

The Solution

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row custom_padding="7px||7px" custom_margin="0px||" _builder_version="3.17.6"][et_pb_column type="4_4"][et_pb_text _builder_version="3.17.6"]

[text-blocks id="technologies-used" align="right"]Hentsū recommended a solution built on Azure Data Factory (ADF), Microsoft's Extract-Transform-Load (ETL) solution for Azure. While there are many ETL solutions that can run on any infrastructure, ADF is very much a native Azure service and ties easily into the other services Microsoft offers.

The key functionality is the ability to define the data-movement pipelines in a web user interface and to set their schedules, which can be either event based (such as the creation of a new file) or time based. Azure then handles the execution of the pipelines to process the data. Creating a pipeline requires relatively little coding, which makes it easy to delegate to staff with less technical experience.

 

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row][et_pb_column type="4_4"][et_pb_text admin_label="Technical Details" _builder_version="3.17.6"]

Technical Details

Hentsū built out the data pipelines to move the data from AWS into Azure. The initial load was triggered manually, but then the update schedules were set to check for new files at regular intervals.

Hentsū created status tables to keep track of each file. These let us follow the state of the data as it passed through the pipelines and use a decoupled structure, so that troubleshooting or manual intervention could happen at any stage of the process without creating dependencies. Individual files and steps could be fixed in isolation while the rest of the pipelines and steps continued uninterrupted. The clean decoupling also meant that any error on a particular step was easily identified and flagged to users for investigation.
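
As an illustration of the status-table idea, the sketch below uses an in-memory SQLite database. The table layout, the state names and the helper functions are assumptions for the example, not the schema used in the project. Each step only asks for files in the state it cares about, which is what allows one file to be fixed and retried without touching the others.

```python
import sqlite3

STATES = ("discovered", "downloaded", "ingested", "failed")  # assumed state names

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE file_status (
           file_name TEXT PRIMARY KEY,
           state     TEXT NOT NULL,
           error     TEXT
       )"""
)


def register(file_name):
    """Record a newly seen file; re-registering a known file is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO file_status (file_name, state) "
        "VALUES (?, 'discovered')",
        (file_name,),
    )


def set_state(file_name, state, error=None):
    """Move a file to a new state, storing the error message if there is one."""
    if state not in STATES:
        raise ValueError(f"unknown state: {state}")
    conn.execute(
        "UPDATE file_status SET state = ?, error = ? WHERE file_name = ?",
        (state, error, file_name),
    )


def pending(state):
    """Files waiting at a given stage, so each step can run independently."""
    rows = conn.execute(
        "SELECT file_name FROM file_status WHERE state = ? ORDER BY file_name",
        (state,),
    )
    return [row[0] for row in rows]


for name in ("a.csv.gz", "b.csv.gz", "c.csv.gz"):
    register(name)
register("a.csv.gz")  # duplicate discovery is ignored

set_state("a.csv.gz", "downloaded")
set_state("b.csv.gz", "failed", error="connection reset")
set_state("a.csv.gz", "ingested")

# A failed file can be reset and retried without touching the others.
set_state("b.csv.gz", "discovered")

print(pending("discovered"))
print(pending("ingested"))
```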

All the data was then mapped back to these tables, to be used if we ever needed to do further processing or cleaning on the final tables. The data was further transformed with additional schema changes to match the client's end use and to map it to the traditional trading data.

The pipelines were deliberately abstracted so that adding new data sources in the future takes as little work as possible. The goal was to make this easy enough for the client's end users to do themselves as and when required.

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row _builder_version="3.17.6"][et_pb_column type="4_4"][et_pb_text admin_label="Benefits & Caveats" _builder_version="3.17.6"]

The Benefits of Azure Data Factory

ADF can run completely within Azure as a native serverless solution. This means there is no need to worry about where the pipelines run or what instance types to choose upfront, and no need to manage servers or operating systems, configure networking, and so on. The definitions and schedules are simply set up and then the execution is handled.

Running as a serverless solution means true "utility computing", which is the entire premise of cloud platforms such as Azure, AWS, and Google. The client pays only for what is used: there are no idle servers costing money without producing anything, and the solution can scale up as needed.

ADF also allows the use of parallelism while keeping costs to only what is used. This scaling up was a huge benefit of ADF for the client, especially when time is of the essence: one server for 100 hours and 100 servers for one hour cost the same, but with 100 servers the work is done in 1/100th of the time. Hentsū tuned the solution so the speed of the initial load was restricted only by the power of the database, allowing the client to balance the trade-off between speed and cost.

ADF has some programming functionality, such as loops, waits, and parameters for the whole pipeline. Although it is not as flexible as a full language (Python, for example), it gave Hentsū significant freedom in designing the workflows.
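
The arithmetic behind the "100 servers for one hour" point can be written out as a small sketch. The hourly rate is an assumed figure for illustration only, and the model assumes the work divides perfectly across servers and is billed purely per server-hour.

```python
def run_cost(servers, hours_each, rate_per_server_hour):
    """Total bill when each of `servers` machines runs for `hours_each` hours."""
    return servers * hours_each * rate_per_server_hour


RATE = 0.50  # assumed price per server-hour, for illustration only

serial_cost = run_cost(servers=1, hours_each=100, rate_per_server_hour=RATE)
parallel_cost = run_cost(servers=100, hours_each=1, rate_per_server_hour=RATE)

serial_hours = 100   # wall-clock time with one server
parallel_hours = 1   # wall-clock time with 100 servers

print(serial_cost, parallel_cost)       # the bill is the same either way
print(serial_hours / parallel_hours)    # speed-up factor from parallelism
```

In practice the speed-up is capped by the slowest shared component, which in this project was the database.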

Caveats

There are limited sources and sinks (i.e. inputs and outputs). The full list is available in the Microsoft documentation. Microsoft's goal with ADF is to get data into Azure products, so if one needs to move data into another cloud provider a different solution is needed.

The pipelines are written in ADF's own proprietary "language", which means the pipeline code does not integrate well with anything else. That would not be the case if they were written in a general-purpose language such as Python, as many other ETL tools allow. This is also the key reason we have developed our own ETL platform for more complex solutions, which uses Docker and more portable Python code.

There were some usability issues when creating the pipelines, with confusing UI or vague errors on occasion; however, these were not showstoppers. Our advice when using the ADF UI is to make small changes and save often. We can see that Microsoft is already aggressively addressing some of the issues we encountered.

Impact

The client was very pleased with the ADF and Azure SQL Data Warehouse solution. The solution automatically scales the compute power to process the data as it changes week by week: it scales up when there is more data and scales down when there is less. Overall, the solution costs a fraction of what it did previously, whilst keeping everything within the client's Azure environment.

[/et_pb_text][et_pb_cta title="Reach Out To Find Out How We Can Support Your Data Science Needs" button_url="https://hentsuprod.wpengine.com/contact" button_text="Contact Us" _builder_version="3.17.6"]

[/et_pb_cta][/et_pb_column][/et_pb_row][/et_pb_section]

Date/Time

Date(s) - 01/01/1970
12:00 AM - 12:00 AM

Location

600 5th ave. NY, NY
[et_pb_section bb_built="1" _builder_version="3.17.6" custom_padding="0px||0px"][et_pb_row _builder_version="3.17.6"][et_pb_column type="4_4"][et_pb_text _builder_version="3.17.6"]

Microsoft recently made a flurry of announcements about Office 365, and especially Microsoft Teams. Below, we highlight some of the key changes important to the asset management space.

Microsoft: Now Available 

Outlook on the web - Conditional Access 

Office 365 admins can now set up policies that block users from downloading files from Outlook on the web to non-compliant devices. This gives users more flexibility on the go while retaining a good degree of security around your company files.

Azure AD Password Protection 

Azure AD Password Protection helps you eliminate easily guessed passwords from your environment, which can dramatically lower the risk of being compromised by a password spray attack. Specifically, these features let you:  

  • Protect accounts in Azure AD and Windows Server Active Directory by preventing users from using passwords from a list of more than 500 of the most commonly used passwords, plus over 1 million character-substitution variations of those passwords.
  • Manage Azure AD Password Protection for Azure AD and on-premises Windows Server Active Directory from a unified admin console. 
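
To give a feel for what a "character substitution variation" check involves, here is a deliberately simplified Python sketch. It is not Microsoft's algorithm: the substitution map and the three-entry banned list are assumptions for illustration, and the real service applies its own, more elaborate matching.

```python
# Assumed, illustrative substitution map: look-alike character -> plain letter.
SUBSTITUTIONS = str.maketrans({"@": "a", "0": "o", "1": "l", "$": "s"})

# Tiny hypothetical banned list; a real list would hold many more entries.
BANNED = {"password", "letmein", "welcome"}


def normalize(password):
    """Lower-case the password and undo common look-alike substitutions."""
    return password.lower().translate(SUBSTITUTIONS)


def is_banned(password):
    """True if the normalized password matches an entry on the banned list."""
    return normalize(password) in BANNED


for candidate in ("P@$$w0rd", "We1come", "correct horse battery staple"):
    print(candidate, "->", "rejected" if is_banned(candidate) else "allowed")
```

Normalizing before the lookup is what lets a short banned list cover a very large number of spelling variants.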

Update to Exchange Mailbox Auditing – Mailboxes Audited by Default and New Mailbox Actions to Audit 

To ensure clients have access to critical audit data to investigate security or regulatory incidents in their tenancy when required, the Exchange Online service introduces a configuration that automatically enables mailbox auditing on all applicable mailboxes for users of the Commercial service. With this update, it is no longer necessary to configure the per-mailbox audit setting for the service to begin storing security audit data. The update also adds new mailbox actions to the audit, which are of high interest for understanding the activities taking place within the tenant.

Combined Password Reset & MFA Registration 

Microsoft released a preview of a new user experience that allows users to register security info for multi-factor authentication (MFA) and password reset in a single experience. Now when a user registers security info such as their phone number for receiving verification codes, that number can also be used for resetting a password. Likewise, users can change or delete their security info from a single page, making it easier to keep information up-to-date. 

Outlook Calendar: Option to Block Forwarding of Meeting Invites 

Meeting organizers have the option to prevent attendees from forwarding a meeting invitation. This option is available only for users in Office 365. In the first release, the option to prevent forwarding is available when creating or editing meetings in Outlook on the web, but the option will become available in Outlook for Windows shortly after. 

In Development: To Keep an Eye On 

Admin tool: TeamSite Auto-Mount 

Admins can specify TeamSite Libraries that they want their users to automatically sync with OneDrive for Business. 

Passwordless Sign-in for Work Accounts 

The Microsoft Authenticator mobile app now supports sign-in to your work accounts with your face, fingerprint or device PIN. You can remove the security risk of passwords and have the convenience of using a device you already own and carry with you. This option can be configured by administrators in Azure Active Directory.

For more information on the latest Microsoft updates, check out the roadmap here.

 

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row _builder_version="3.17.6"][et_pb_column type="4_4"][et_pb_cta title="Contact Us" button_text="Click Here" _builder_version="3.17.6" button_url="https://hentsuprod.wpengine.com/contact"]

To learn more about how we can support you with these updates and more, contact us today. 

[/et_pb_cta][/et_pb_column][/et_pb_row][/et_pb_section]

[et_pb_section bb_built="1" _builder_version="3.17.6" custom_padding="0px||0px"][et_pb_row _builder_version="3.17.6"][et_pb_column type="4_4"][et_pb_text _builder_version="3.17.6"]

We have provided cloud solutions to asset managers for the past three years and in this time have completed various types of email migrations to Office 365. These migrations range from moving clients entirely to Office 365 from an on-premise Exchange or a third-party legacy private cloud provider, to working with hybrid solutions that span both the client's own on-premise infrastructure and Office 365.

During these migrations we noticed a range of issues with clients who opted to set up their Office 365 accounts via a more economical re-seller or through bundled packages with other services. 

Here are a few things to be wary of when setting up your Office 365 accounts with the wrong partner:

  1. Some of these providers offer what is called a “Syndication Tenant”. Microsoft retired this type of subscription, but it is still offered by many existing re-sellers. With a Syndication Tenant agreement, the Office 365 account, Azure AD tenant and data are held by the re-seller and can’t be easily migrated away. In this setup, a multi-step process is required to hand over the account to another partner: the data needs to be backed up, the account deleted, and the data re-imported into a new account. All of this means extra complexity, user upheaval and extended downtime.
  2. Some re-sellers, especially the Syndication Tenant providers, do not offer the account holder true admin rights, which means only a subset of the Office 365 functionality and management is available.
  3. You could end up locked into a strict contract when negotiating your agreement. Sometimes your contract could last up to two or three years with no variations possible on the user services and license counts. 
  4. Security options are limited compared with native Office 365, which offers features such as conditional access policies.
  5. Interface solutions from these re-sellers often lack basic functionality such as single sign-on tools.

With issues like these it is important to do your due diligence when exploring your options before committing. More likely than not you will find that your safest and most efficient option is to partner with a trusted and experienced service provider or go to Microsoft directly.

How Hentsū does it differently

We are a Tier 1 Microsoft Cloud Service Provider (CSP) and work directly with Microsoft. We are also a Silver partner and specialists in the asset management industry. All of this allows us to provide a range of flexible solutions tailored to the world of fund management. 

A good time to reach out to us is when your fund is about to be registered (SEC, FCA, etc). We know the industry requirements and can provide guidance on best practices and compliance. We also have the ability to work earlier with startups to ensure that they have all the tools in place from day one and can scale as they grow. 

Generally, we advise taking the following steps when setting up cloud services:

  • Create a native Office 365 account through Microsoft directly, or use one of the Hentsū starter packages. We create client accounts directly with Microsoft so you hold the keys to the Azure AD tenant.  Your data is always your data so you can migrate to another provider at any time.  
  • Validate that you hold ownership over your Office 365 account and email domain. 
  • Purchase license subscriptions and set up users and groups.
  • Don’t go for 12-month commitments until you are sure which services you actually need. We offer all of our clients the same 12-month discounts but on a monthly rolling basis.
  • Set up data loss prevention and data retention policies, and make use of two-factor authentication and mobile device security. We enable all these features by default as you on-board to our setup.

So be sure to carefully consider all the possibilities before signing on with Office 365 re-sellers or bundled solutions, as there is a range of options for your Office 365 needs. If you are unsure of where to go next for your Office 365 solutions, reach out today to learn how we can best support you.

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row _builder_version="3.17.6"][et_pb_column type="4_4"][et_pb_cta title="Talk to us about your Office 365 needs" button_url="https://hentsuprod.wpengine.com/contact" button_text="Contact Us Today" _builder_version="3.17.6"]

[/et_pb_cta][/et_pb_column][/et_pb_row][/et_pb_section]

Marko Djukic, CEO and founder of Hentsū, reflects on the advantages of the public cloud in asset management, the enhanced security it delivers and the evolution of data science it enables. Read the piece to learn more: The Power of the Public Cloud


AWS now in the UK

Today, Amazon announced the launch of AWS data centres in London. In addition to the existing Dublin and Frankfurt locations, AWS now has a third European region choice, and one which is especially useful to UK customers. As usual for AWS regions, this UK location comes with multiple levels of resiliency and launches with two Availability Zones.

Benefits

But what does this mean for asset managers using AWS technology and Hentsū customers in the UK?
  • Lower latency to your UK offices and data centres
  • Lower latency to other financial institutions, exchanges, and market data providers
  • If there are any specific requirements around UK jurisdiction, this is now easily mandated as part of the deployment
  • Connectivity is now available as a cross connect from multiple points of presence in London
Hentsū is now deploying to AWS London. For existing customers, we can redeploy your existing environments to the UK region with minimal downtime.

More Information

The AWS Europe (London) Region offers two Availability Zones at launch. AWS Regions are comprised of Availability Zones, which refer to technology infrastructure in separate and distinct geographic locations with enough distance to significantly reduce the risk of a single event impacting availability, yet near enough for business continuity applications that require rapid failover. Each Availability Zone has independent power, cooling and physical security, and is connected via redundant, ultra-low-latency networks. AWS customers focused on high availability can architect their applications to run in multiple Availability Zones to achieve even higher fault tolerance.

AWS also provides multiple Amazon CloudFront edge locations in the UK for customers looking to deliver websites, applications, and content to UK end users with low latency. These locations are part of AWS’s existing network of 68 edge sites across North and South America, Europe, Asia, and Australia.

See the full press release here. Contact us for more information on how you can take advantage of the new UK London AWS region.
