
Case study: Dock

Managing developers' access to production for a fintech unicorn

How a fintech unicorn is using Runops to:

  • reduce time to resolution of support tickets;
  • and increase feature delivery speed

by removing developers' dependency on manual work from the security and DevOps teams.

Company#

Dock is one of the largest fintech startups in the world. They have raised almost 300 million USD. The company has 1,500 people, and almost 1,000 of them work in technology. Dock provides banking, issuing, acquiring, and many other components of the financial stack as a service. One could build just about any financial application on top of Dock APIs.

Scenario#

Because Dock operates in a highly regulated industry, they need a lot of controls in production. Besides caring a lot about their users' data, they also have to comply with industry standards to operate in the financial industry.

PCI is one of the certifications required to operate in their industry. The impact on technology teams is huge. Things that traditionally work for non-regulated businesses break for fintechs.

One such example is managing access to production databases and containers. Startups from many industries don't care about this problem at all. Developers have full raw access to production.

Many large tech-media companies are known for their speed, and a lot of it comes from their developers' autonomy. This very fact contributed to the large data issues coming out of the tech giants. The impacts on their businesses were big, but they are still in business.

That would not be the case in the financial industry. Developer autonomy is key for productivity, but it is hard to achieve in a highly regulated industry. This is a critical need for Dock as a disruptive fintech that requires speed.

Problem#

The lack of solutions to help with the problem created manual work. They lost speed to ensure systems were safe. Developers that needed access to production had to use a gated process implemented by the security team.

Engineers had to open tickets with SQL queries using Jira. The security team validated the query for:

  • Reliability
    • Ensure the queries weren't destructive to the production environment.
    • Bad WHERE clauses and anything that could disrupt production had to be evaluated before touching production to reduce the chances of an outage.
  • Security
    • Ensure the data access is clear of any PII data.
    • When PII is required, validate the need and validate the least possible exposure.
  • Compliance
    • Leave records of everything throughout the process.
    • Who, what, when, where, why, had to be answered for everything that touched production.
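To make the reliability check concrete, here is a minimal sketch of the kind of automated pre-check that could flag risky statements before a human reviews them. It is an illustration only, not how Dock's security team or Runops validates queries: the function name findRiskyStatements is made up, and the check is naive text matching rather than real SQL parsing (a semicolon inside a string literal would confuse it).

```javascript
// Hypothetical pre-review check: flag UPDATE/DELETE statements that have no
// WHERE clause, plus any DROP or TRUNCATE. Naive text matching, not a parser.
function findRiskyStatements(sql) {
  return sql
    .split(';')
    .map((statement) => statement.trim())
    .filter((statement) => statement.length > 0)
    .filter((statement) => {
      const isDropOrTruncate = /^(drop|truncate)\b/i.test(statement);
      const isWrite = /^(update|delete)\b/i.test(statement);
      const hasWhere = /\bwhere\b/i.test(statement);
      return isDropOrTruncate || (isWrite && !hasWhere);
    });
}

// Only the UPDATE without a WHERE clause is flagged here.
const risky = findRiskyStatements(
  "UPDATE accounts SET status = 'blocked'; SELECT id FROM accounts WHERE id = 1;"
);
console.log(risky);
```

A check like this can only narrow down what reviewers look at; the PII and compliance questions in the list above still need a person.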

After all the checks, the sec team had to manually:

  1. Open the VPN
  2. Access the database
  3. Open Jira to copy the query
  4. Run the query on the database
  5. Validate execution and results
  6. Report back to the developer

They had more than 50 requests a week.

Multiple people in the security team had to work full time on solving these tickets. High-impact security engineers working on this process were spending all their time manually copying and pasting SQL scripts. Developers were unhappy too.

Due to the large number of requests, tickets would take from hours to days to execute. This delay generated a lot of context switching and broke developers' productivity.

In the end, the company was looking at impacts of:

Lost investments#

  • Idle developers blocked waiting on the security team
  • Security engineers working on low impact manual tasks full time

Slower growth#

  • Slower feature release slowed down growth.
  • Features needed to grow the business by providing more value to customers were delayed.

Churn#

  • Slow execution of scripts impacted support requests resolution time.
  • A worse user experience also creates churn.

Team#

The Runops project was initiated by the SRE Team. They take care of the developer platform and collaborate with security.

The security team saw the potential of Runops. Removing manual work and increasing developer autonomy was a big desire. Getting there while maintaining the compliance, reliability, and security requirements would be huge.

Thiago Mouro, one of the security engineers executing scripts manually, took on the challenge to automate the process using Runops.

Stack#

Dock is a microservices powerhouse, with hundreds of databases and services running in the cloud. They use AWS-managed databases and services.

Goal#

The goal of the project was to enable developers to execute their SQL scripts directly against the live database. Execution had to happen only after scripts were reviewed by the teams impacted by the changes. This would increase autonomy.

Developers would be in charge of ensuring executions went well, while the sec team performed only the security analysis, a drastically reduced scope. The load on the security team would drop, because sec would no longer need to also run each script and ensure it went well.

Solution#

They decided to use Runops gated access control. It enabled developers to create their scripts from Slack, web, or terminal. Jira was out of the way.

The Runops Slack app was used for real-time review workflows. The team got review requests in Slack. Each request included who was doing what, where, when, and why: everything they needed. They made decisions in seconds.

Then came execution time. The Runops execution runtime was used to send only approved commands to the live database. Centralized audit trails were automatically generated for compliance.

Everything was authenticated through SSO with Okta.
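As a rough sketch of the flow described in this section, the snippet below shows a gate that runs a command only when it has an approval and writes an audit entry for every attempt. The function name, the field names, and the runCommand callback are all assumptions for illustration; this is not the Runops implementation or its data model.

```javascript
// Hypothetical gate: a command runs only when approved, and every attempt
// leaves an audit entry answering who, what, where, why, and when.
function executeIfApproved(request, runCommand, auditLog) {
  const entry = {
    who: request.user,
    what: request.script,
    where: request.target,
    why: request.reason,
    when: new Date().toISOString(),
    approvedBy: request.approvedBy || null,
    executed: false,
  };
  if (request.approvedBy) {
    // Only approved requests ever reach the live target.
    entry.result = runCommand(request.target, request.script);
    entry.executed = true;
  }
  auditLog.push(entry);
  return entry;
}
```

Keeping the audit write in the same code path as the execution is the design choice that matters here: nothing reaches the target without leaving a record.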

Systems replaced#

The new process replaced:

  • VPN for production access
  • Static passwords for database users
  • Jira ticketing system
  • Manual copying and pasting of commands
  • Slack messages for coordinating priority of tickets

Business impacts#

Reduced time to resolve support tickets#

  • A result of production database access requests going from days to minutes.
  • In addition to increased customer satisfaction, developers got extra time to work on features.

More time for revenue-related features#

  • High-leverage projects out of backlog
  • The two people from the sec team working full time on access requests started working on strategic projects.
  • Automation projects that were long due got out of the backlog.

Reduced number of systems#

  • Replaced VPNs for database access, Jira ticketing, Slack messages for discussing executions

Faster compliance audits#

  • Runops SOC2 and PCI compliance reports replaced in-house Jira customizations.
  • No time spent on Jira settings during audits.
  • Reports built-in.

Extra time to work on strategic projects#

  • Competitive advantage for Dock by not spending valuable time on manual operations.
  • Impressive pace of speed and innovation despite their huge size and complexity of their technology.

Indirect results#

Improved M&A results#

  • Companies joining the team integrated Runops workflows in days and processes standardized across large organizations.

Promotions#

  • Runops was a super high visibility project as it improved the lives of every engineer in the company. The impact ended up resulting in promotions for champions of the project as they got a lot of appreciation from all areas of the company for the great results.

Thank you#

Thank you Jean Dias, Thiago Mouro, Renato Matos, and everyone that took part in the project. Your feedback to Runops was critical to getting us where we are today. We are proud of the partnership we created together. Let's make it better together for the years to come!

New Dashboard page for admins

We've just released the Dashboard page for admins on our Portal. 🎉

It includes:

  • An area where you can see queries running now;
  • An area with all queries waiting for review;
  • A list of total queries organised by their status, where you can filter by days;
  • And above it all, it's possible to visualise things by specific connections and users.

Next steps:

  • Monitor usage and improve the current features;
  • Listen to users' feedback to understand more filter possibilities;
  • Study the next kind of data we want to show on this page;
  • Fancy-pants graphs 🎩📊.

Runops Releases #26

We moved a lot of furniture from place to place, and we found better locations for lots of stuff in our Portal. So get some water and get ready!

Starting with the new templates and reviews page. If you thought, "oh, nice, now they have a review and templates page!" Well, that's why we moved it there, it wasn't new, it was just a bit hidden.

Templates page#


We introduced a new experience for Templates, and now you can see and execute your templates from this new page with much more control than before. This page shows all templates available for you and lets you run and see the results right away.

Reviews page#


Not only have we moved it to a dedicated page, but we also let you know in advance if some of your colleagues need your review on something.

Users and agents#


As a design decision, from now on (if you're an admin), you can find the page to manage your users and see your agents on the settings page.

Runops Releases #25

We are improving some visualizations and adding some really cool features to Runops. So ensure you have your seat belt and take that sip of water to stay hydrated while we go through the news!

New layout on Targets visualization


We added a card visualization, and now it's possible to filter specific targets by their type and/or review type.

"Oh no, I liked the list visualization. What do I do now?" We got you; we didn't remove the list visualization. You just need to select the one that is right for you on the right side, as shown in the image below.

Screenshot 2022-05-02 at 13 47 04

Next steps

We intend to improve the experience on this page and widen its audience by allowing non-admin users to access it and use it as a doorway to the targets they are allowed to see and run their scripts on.

Kill task

In the details of a task, you can kill the task if these two rules are true: it's your task (you're the owner), and the task has the status running. If both are true, a button will show up. This action is an attempt to kill the task, and it's not guaranteed that the task will be canceled.
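The two rules for showing the kill button can be read as a small predicate. The sketch below assumes hypothetical field names (ownerId, status, id); they are not taken from the Runops API.

```javascript
// Hypothetical field names: the button shows only for the owner of a
// task that is currently running.
function canKillTask(task, currentUser) {
  return task.ownerId === currentUser.id && task.status === 'running';
}

console.log(canKillTask({ ownerId: 'u1', status: 'running' }, { id: 'u1' }));
```

Even when the predicate is true, the kill itself remains a best-effort request, as noted above.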


Copy task URL to clipboard

We added a small action to copy the task URL to your clipboard, right beside its ID.


That's all for now!

Cheers!

No-code Interface for Firestore in Slack: Part 1

Rogerio

Engineering

Hey everyone!

If you are a developer that gets stopped by PMs all the time to do simple Firestore updates, this guide will save you a lot of time.

Or, if you are a PM who wants to route fewer things through the dev team and deliver faster for your users: you are in the right place.

In this two-part guide you will learn how to create a no-code interface for updating Firestore inside Slack.

In the end you'll take 10 seconds to go from this:

To this:

For any script.

In part one you will run and share Firestore update scripts from Runops.

In part two you will create no-code interfaces for these scripts in seconds.

Scenario#

You are working with a Feature Flag system backed by Firestore. Different users see different colors in the product experience based on these flags.

Product Managers need to update these flags many times a week for different users, but they don't have access to Firestore.

This is the sample table structure:

{
  collections: {
    featureFlags: { ...featureFlags },
    whitelabel: {
      documents: {
        label1: {
          palette: {
            primary: '#fff000',
            secondary: '#000fff'
          }
        },
        label2: {
          palette: {
            primary: '#cccbbb',
            secondary: '#bbbccc'
          }
        }
      }
    }
  }
}

Accessible viewing and reading: this is an image showing my particular firebase store before the script runs.

Step 1 - Signup#

Head over to https://use.runops.io/signup, insert your email, and complete the signup 🙂



Step 2 - Add a Connection#

After signup, you get to the Tasks page. From there click on Targets.

Inside Targets, click Create then advanced configurations. Here is what to provide for each section:

Information:

Name: your Firestore table name

Type: node

Permission management:

skip for now

Secrets management:

Choose: runops-hosted.

  • This option uses the Runops environment to host the connection.

Key: set a name for your environment variable, like FIREBASE_CONFIG.

Value: add a Firebase Service Account JSON with access to your table.

Here is how to create this JSON if you don't already have one:

  1. In the Firebase console, open Settings > Service Accounts.

  2. Click Generate New Private Key, then confirm by clicking Generate Key.

  3. Paste the contents of the JSON file containing the key into the Value field.

Here is what the Service Account JSON should look like:

{
  "type": "{your-type}",
  "project_id": "{your-project}",
  "private_key_id": "{your-private_key_id}",
  "private_key": "{your-private_key}",
  "client_email": "{your-client_email}",
  "client_id": "{your-client_id}",
  "auth_uri": "{your-auth_uri}",
  "token_uri": "{your-token_uri}",
  "auth_provider_x509_cert_url": "{your-auth_provider_x509_cert_url}",
  "client_x509_cert_url": "{your-client_x509_cert_url}"
}

Agent Tags:

Tags: test

  • Tag is a reference to your agent. In this case, use the test tag to use the agent hosted by Runops.

This is a GIF that shows the target setup flow:

Accessible viewing and reading: this gif shows us how the user is putting the data inside the form of creating a new target.

Now your connection is ready to use! 🎉

Step 3 - Using the connection#

Now we need to create a script to run in your connection. In this example we will update a Firestore table. /Just.code.it

In Runops, head over to Tasks and select your firebase connection in the top left. Then put the script in the editor and run it!

Here is a script that updates the sample table. You can adapt it to your structure:

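// Load the Firebase Admin SDK modules used by this script.
const { initializeApp, cert } = require('firebase-admin/app');
const { getFirestore } = require('firebase-admin/firestore');

// The service account JSON comes from the FIREBASE_CONFIG secret set on the target.
const firebaseConfig = process.env.FIREBASE_CONFIG;
const firebaseConfigParsed = JSON.parse(firebaseConfig);

initializeApp({
  credential: cert(firebaseConfigParsed)
});
console.log('Access to DB');

const db = getFirestore();
const docRef = db.collection('whitelabel');
const newDoc = 'label3';

const main = async () => {
  console.log('Inserting new doc with data');
  // set() creates the document, or overwrites it if it already exists.
  const res = await docRef.doc(newDoc).set({
    palette: {
      primary: '#455355',
      secondary: '#344566'
    }
  });
  console.log('Finished with success', res);
};

// Report failures and exit non-zero instead of leaving an unhandled rejection.
main().catch((err) => {
  console.error('Failed to update Firestore', err);
  process.exitCode = 1;
});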
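The script above passes FIREBASE_CONFIG straight to JSON.parse, so a missing or malformed secret fails with a fairly cryptic error. A small guard like the sketch below could make that failure easier to read. The helper name parseServiceAccount and the choice of which fields to require are assumptions for illustration; the field names themselves come from the service account JSON shown in Step 2.

```javascript
// Fields checked here are a subset of the service account JSON from Step 2.
const REQUIRED_FIELDS = ['project_id', 'private_key', 'client_email'];

// Hypothetical helper: parse the secret and fail with a readable message.
function parseServiceAccount(raw) {
  if (!raw) {
    throw new Error('FIREBASE_CONFIG is empty or not set');
  }
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    throw new Error(`FIREBASE_CONFIG is not valid JSON: ${err.message}`);
  }
  const missing = REQUIRED_FIELDS.filter((field) => !parsed[field]);
  if (missing.length > 0) {
    throw new Error(`FIREBASE_CONFIG is missing: ${missing.join(', ')}`);
  }
  return parsed;
}
```

In the script, the result would replace the plain JSON.parse call, for example cert(parseServiceAccount(process.env.FIREBASE_CONFIG)).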

After running the script, the whitelabel collection is updated!

Woohoo, look at this:

Accessible viewing and reading: this is an image showing my particular firebase store after the script runs

That's it for Part 1. You can use Runops as a safe way to access your Firestore database. You can share this script with others and have them re-run it inside Runops instead of learning the Firebase UI.

But it can get waaay better. Check out part two to learn how to create a no-code interface based on this script in less than one minute.

No-code Interface for Firestore in Slack: Part 2

Rogerio

Engineering

This is Part 2 of the series, check out Part 1 before continuing.

All right, let's do it!

For context, in Part 1 we set up a Runops connection to Firebase, which helped us update Firestore by adding a new whitelabel to our Feature Flag system.

The structure looks like this:

{
  collections: {
    featureFlags: { ...featureFlags },
    whitelabel: {
      documents: {
        label1: {
          palette: {
            primary: '#fff000',
            secondary: '#000fff'
          }
        },
        label2: {
          palette: {
            primary: '#cccbbb',
            secondary: '#bbbccc'
          }
        }
      }
    }
  }
}

Accessible viewing and reading: this is an image showing my particular firebase store before the script runs.

Enter Runops Templates#

This is where we will set up some magic. One of the reasons Runops Templates are so powerful is that you can take any ad-hoc script and turn it into an automation in 30 seconds. All you need to do is add the script to Github.

No configurations.

No drag and drop.

No SDKs.

Just add a file to Github as-is and your automation is ready.

Step 1 - Connecting Runops to a Github repository#

Let's connect Runops to your Templates repository on Github. It is ideal if you create a dedicated repository for Runops Templates, but you will be fine using an existing repo just to follow along.

Head back to the Runops portal and:

  1. On the top right side, click on your organization (email or organization name) and choose the Settings option;

  2. Click on the Integrate with Github button;

  3. Follow the integration flow of Github;

  4. The final step shows a Configure button, like the photo below. Click on it;

Accessible viewing and reading: this is an image showing my integration page in Github with Runops bot.

  5. Choose your organization or the organization of your company;

  6. Choose your repository. 🎉

Here is what the whole process looks like:

Accessible viewing and reading: this is a gif showing the integration with Github.

Github integration completed! Let's create our first automation.

Step 2 - Create your first automation#

We will use the same script from Part 1. But let's update two small bits. Are you able to figure it out?

Check Part 1 if you are curious 🙈

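// Load the Firebase Admin SDK modules used by this script.
const { initializeApp, cert } = require('firebase-admin/app');
const { getFirestore } = require('firebase-admin/firestore');

// The service account JSON comes from the FIREBASE_CONFIG secret set on the target.
const firebaseConfig = process.env.FIREBASE_CONFIG;
const firebaseConfigParsed = JSON.parse(firebaseConfig);

initializeApp({
  credential: cert(firebaseConfigParsed)
});
console.log('Access to DB');

const db = getFirestore();
const docRef = db.collection('whitelabel');
// Template variable: replaced with the value typed into the form.
const newDoc = '{{label_name}}';

const main = async () => {
  console.log('Inserting new doc with data');
  const res = await docRef.doc(newDoc).set({
    palette: {
      // Template variables for the two palette colors.
      primary: '{{primary_color}}',
      secondary: '{{second_color}}'
    }
  });
  console.log('Finished with success', res);
};

// Report failures and exit non-zero instead of leaving an unhandled rejection.
main().catch((err) => {
  console.error('Failed to update Firestore', err);
  process.exitCode = 1;
});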

I will not explain the code, but there is one important thing in a template here, the variables!

They look like: {{variable_here}}

In this code we have three variables:

  • {{label_name}}

  • {{primary_color}}

  • {{second_color}}
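To get a feel for how a form could be derived from these placeholders, here is a minimal sketch that finds the {{variable}} names in a script and fills them in. It only illustrates the idea; it is not how Runops Templates are implemented, and the function names extractVariables and renderTemplate are made up for this example.

```javascript
// Matches {{name}} placeholders, allowing optional spaces inside the braces.
const PLACEHOLDER = /\{\{\s*([A-Za-z_]\w*)\s*\}\}/g;

// Hypothetical helper: list each variable once, in order of first appearance.
function extractVariables(template) {
  const names = new Set();
  for (const match of template.matchAll(PLACEHOLDER)) {
    names.add(match[1]);
  }
  return [...names];
}

// Hypothetical helper: fill in the placeholders, failing on missing values.
function renderTemplate(template, values) {
  return template.replace(PLACEHOLDER, (placeholder, name) => {
    if (!(name in values)) {
      throw new Error(`Missing value for ${name}`);
    }
    return String(values[name]);
  });
}

const snippet = "doc('{{label_name}}').set({ primary: '{{primary_color}}' })";
console.log(extractVariables(snippet));
console.log(renderTemplate(snippet, { label_name: 'label3', primary_color: '#455355' }));
```

Each extracted name would map to one input field, and the rendered script is what finally runs.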

Now we need to insert this code as a file inside your templates Github repository! Let's do it:

Accessible viewing and reading: this is a gif showing how to add a new file in our template repository.

These variables show us the power of Templates because now we don't need to change this script every time before running. We just execute it as a template and put the values before running the script. Let's run it!

Step 3 - Running a task from a template#

This step will finish the plot of this article showing us how to execute a task from a template.

If you didn't install the Slack app during the signup process, take a moment to do so now from the Settings page.

In Slack, type /runops templates anywhere to see a list of Templates coming directly from Github files!

Select the file we just created and voilà! A form is created from the variables inside our script!

You can also run the Template from the web portal:

  1. Go to the Tasks page;

  2. Click on the arrow part of the New task button and choose the option New task from template;

  3. A list of files is displayed, the same files in your repository root! Choose the name of your file; mine was firebase-template.js;

  4. Fill in the inputs with the values you prefer and enter the name of your target; mine is firebase-node-tutorial;

  5. See the magic happening! 🎉

Accessible viewing and reading: this is a gif showing how to execute a task from one template in Runops portal.

Looks amazing, right!? Now you can create a lot of whitelabels, or put this template in the hands of other people more involved with this subject. 🙂

This is one way Runops can save you time so you can work on things that really matter.

That's all folks! See you around! Bye o/

Runops Releases #24

We got some updates on the new task page for y'all!

We did some design updates and added some features to the new task page. So buckle up, because it's the first step to improving a lot more stuff in there:

  • We added a fancy-pants Combobox component to the application, so you can search for all your targets and make your workflow smoother while switching and selecting targets 🕺
  • Proportion and use of space were adjusted so you can better visualize the logs of your tasks. We took just a little space from the editor but substantially increased the space for the logs, using the same total area 👀
  • The target selector was placed on top of where the schema for your MySQL and Postgres databases is loaded, to improve the organization of the page 📄
  • For every target you select, we change the URL so you can save that link and always have direct access to the targets you work with the most 🔗

Next steps

  • We still want to improve the use of space in the logs area, so some new design updates are coming on the tab component and in the actions of that page;
  • We aim to make this page very close to an IDE (Integrated Development Environment), so many more exciting features and improvements are under development or discussion.

Stay tuned 📻

Here's a quick GIF showing this experience and two screenshots comparing how it was and how it is now:

GIF showing the Combobox experience: new-design-updates-new-task

How it was: Screenshot 2022-04-19 at 14 40 53

How it is: Screenshot 2022-04-19 at 14 41 24

AWS ECS execute-command vs Kubernetes kubectl

The ability to run commands inside a running ECS task was one of the most requested features of the ECS service. Yet, AWS released a pretty bad implementation if we compare it to alternatives like Kubernetes.

Some of the key problems:

1. AWS ECS execute-command requires installing an SSM manager on the host instance.#

For each kind of environment you need to understand how to install it:

  • If it's on a managed EC2 instance, you need to install it yourself
  • If the instance is managed by ECS, you need to look up an attribute to configure this
  • To illustrate how bad this is, there is a script just to check whether it's configured properly

2. aws ecs execute-command doesn't propagate the exit code properly#

All Unix tools that fit in this category do that (kubectl exec, ssh, docker, etc.). This adds more complexity to how we interact with the CLI, particularly for building automations on top of it. An error in the remote execution returns as success to the calling process. The execution only fails if aws ecs execute-command itself fails to run.
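For reference, this is roughly what exit-code propagation looks like in a wrapper, sketched with Node's child_process module. The function name runAndGetExitCode is made up, and the fallback codes (127 when the command cannot be started, 1 when it is killed by a signal) are conventions chosen for this sketch, not something any of the tools above prescribes.

```javascript
const { spawnSync } = require('child_process');

// Run a command and return the exit code a wrapper should propagate.
function runAndGetExitCode(command, args) {
  const result = spawnSync(command, args, { stdio: 'inherit' });
  if (result.error) {
    // The command could not be started at all (for example, not found).
    return 127;
  }
  // status is null when the child was terminated by a signal.
  return result.status === null ? 1 : result.status;
}

// A child that exits with code 3 makes the wrapper report 3 as well.
const code = runAndGetExitCode(process.execPath, ['-e', 'process.exit(3)']);
console.log(code);
```

A real wrapper would finish by setting process.exitCode to that value, so scripts and CI jobs calling it can detect the remote failure.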

3. AWS ECS doesn't use -- to separate flags from command arguments properly#

You can't provide the command as an argument list, only as a bare string, which prevents a user from using shell operators like | and ;.
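The -- convention is simple to sketch: everything before the first -- belongs to the tool, and everything after it is the command, kept as a list so no extra quoting or re-parsing is needed. The helper name splitAtDoubleDash and the my-pod name are made up for illustration.

```javascript
// Split argv at the first "--": what comes before it belongs to the tool,
// what comes after it is the command, kept as an argument list.
function splitAtDoubleDash(argv) {
  const index = argv.indexOf('--');
  if (index === -1) {
    return { toolArgs: argv.slice(), command: [] };
  }
  return { toolArgs: argv.slice(0, index), command: argv.slice(index + 1) };
}

const parsed = splitAtDoubleDash(['exec', 'my-pod', '--', 'sh', '-c', 'ls | wc -l; echo done']);
console.log(parsed.toolArgs);
console.log(parsed.command);
```

Because the command stays a list, the shell string with | and ; reaches sh -c as a single argument and is interpreted inside the container rather than by the local tool.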

Alternatives#

If we look at Kubernetes' kubectl, none of these problems exist. The kubectl exec implementation follows basic Unix principles more closely:

  • It propagates the exit code, like every other tool in this category
  • It does not require any dependency to execute commands remotely in the container
  • It uses -- to separate flags from command arguments

The same is true for ssh, docker, and others. Let's hope AWS catches up with these features soon.

Runops Releases #23

The new year is here and we want to celebrate with you by sharing exciting new features that will make your life a whole lot easier.

Happy 2022 and let's dive in!

Run tasks in the Runops Portal#

Use a large canvas to edit your scripts, with logs loading automatically right below and your whole history of executions accessible as you run things.

When you are done with a Target, easily switch targets to run different scripts.

Wanna re-run a script from yesterday?

Filter your history to find the script you need and re-execute a task with 2 clicks!

Hit cmd (mac) / win (windows) + enter to run your script from the canvas.

Here is a quick video of Matheus showing all these features in action: https://runops.io/docs/english-videos/#intro

Slack workflows#

No need to copy and paste the same script every day anymore.

If you have a script you need to run every day, like a database check or a financial report, you can now use the Runops step in Slack Workflows to schedule the execution of a task with a few clicks, all within Slack.

After the setup you get a message from the Runops bot with the results of your task every day at the configured time.

Check out this video on how to set it up: https://runops.io/docs/portuguese-videos/#tarefas-agendadas

You can also use the Runops workflow step to create shortcuts.

Use Shortcuts to make scripts such as a report or a db check accessible through a button available in the Slack chat panel.

The Runops bot speaks SQL!#

If you like running tasks from Slack you will love this one. Now you can chat with the Runops bot using SQL! That is correct, you send SQL or any script in the chat and the bot replies with the results.

Here is how to use it:

  1. List the available Targets by typing targets
  2. Select a target by typing target my-target-name
  3. Start sending your script and get the results back for each message!
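The three steps above boil down to a tiny command grammar: targets, target followed by a name, and anything else treated as a script. Here is a sketch of how a message could be classified; it is not the Runops bot's code, and the function name and result shapes are assumptions for illustration.

```javascript
// Hypothetical classifier for chat messages sent to the bot.
function parseChatMessage(message) {
  const text = message.trim();
  if (text === 'targets') {
    return { type: 'list-targets' };
  }
  const match = text.match(/^target\s+(\S+)$/);
  if (match) {
    return { type: 'select-target', target: match[1] };
  }
  // Anything else is sent to the selected target as a script.
  return { type: 'run-script', script: text };
}

console.log(parseChatMessage('target my-target-name'));
console.log(parseChatMessage('SELECT count(*) FROM users'));
```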

Check out this video for more details on how to use it: https://runops.io/docs/portuguese-videos/#chatbot

Improved Agent reliability#

We released multiple enhancements to the Agent communication with the API.

The latest Agent version has increased capabilities in terms of how long a task can last, the size of scripts and outputs it can handle, and many other enhancements and bug fixes.

If you aren't using the latest version already, we highly recommend you upgrade your agents.

You can find the latest version along with the full details of the changes here: https://github.com/runopsio/agent/releases

Welcome NodeJS!#

You can now run NodeJS scripts in Runops. Many of you use Python or Ruby to create more complex automations, but now you can also use NodeJS.

You can see the list of npm packages available here: https://github.com/runopsio/agent/blob/main/Dockerfile#L116 - feel free to ask for new packages or contribute with suggestions.

Web signup#

New organizations can now sign up for Runops from the browser.

Before this release, new organizations were only able to sign up from our CLI. Many people love the CLI experience, but this is not the case for everyone.

So go ahead and tell your friends from other companies that they should be using Runops and can start here: https://use.runops.io/signup

Thanks!#

That is it for this update.

As always, please let us know of any feedback, suggestion, comment, or anything that comes to mind.

See you next time!

Runops Releases #22

Presenting: The Runops Web App!#

You can now use Runops in the browser! The web app experience has been live for a few weeks and the feedback has been fantastic. Here are some of the things users love in the web app:

Tasks Inbox

You can view your Tasks' history in a simple, yet powerful list. In the Tasks' detail view you can see review information, script, and logs!


Admins using the web app can manage users, targets and agents!

  • Create, deactivate, and update users' groups.
  • Create and update Targets.
  • See your organization's connected agents and metadata.

Welcome groups, farewell roles and teams#

Now, instead of controlling access to Targets using roles, and reviews with teams, you use only one property: groups. Plus, we added the capability of controlling access to templates using groups.
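One way to picture a single groups property is as a membership check shared by targets, reviews, and templates. The sketch below assumes every user and every resource carries a groups array; that shape and the canAccess name are assumptions for illustration, not the Runops data model.

```javascript
// Hypothetical check: a user can use a resource (target, template, review)
// when they share at least one group with it.
function canAccess(user, resource) {
  return resource.groups.some((group) => user.groups.includes(group));
}

const user = { groups: ['payments', 'sre'] };
console.log(canAccess(user, { groups: ['sre'] }));
console.log(canAccess(user, { groups: ['security'] }));
```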

Reliability Enhancements#

We made some structural changes in the last few weeks to Runops. Despite the size of the changes, most of you didn't even notice it. This is because we invested a lot of time in improving our reliability. We are not only fixing problems faster when they happen, but even fixing them before you notice. Some of these improvements include:

  • Runops Agent runtime instrumentation. The Runops Agent now loads and exports runtime configurations. This improves our visibility into problems and increases our abilities to fix problems in the agent without involving your team.

  • Central monitoring tool. All our systems now use the same central error monitoring and tracking system. This system makes sure that we get notifications for any problem that happens in any system. Notifications include all the metadata on the problem, including the individual user and organization. This is why you get messages from our team when you have problems even before you think about reaching out.