Thoughts on Security
Last week we launched Kite, a copilot for programmers. We’ve been excited about the Kite vision since 2014—we’re blown away by how many of you are excited about it too!
The response far exceeded our expectations. We had over a thousand upvotes on Hacker News; we were in the all-time top 1% of launches on Product Hunt; and we had over two thousand tweets about Kite, not counting retweets. We couldn’t be more grateful to those who believed in the vision and took the time to share Kite with a friend or join the discussion online.
That said, we have a lot of work to do. Kite is the first product of its kind, which means we’re pioneering some new terrain. We signed up for this, and are committed to getting it right.
Why Cloud? Garmin versus Waze.
The first question is: why keep the copilot logic in the cloud, instead of locally as part of the Kite install? The short answer is we can build a better experience if Kite is a cloud service.
The full answer is a long list of things that are better about cloud services. Editors today are Garmin GPS, and Kite is Waze. Some folks still use Garmin GPS due to privacy concerns, but most of the world uses internet-connected navigation for its many advantages: fresher maps, more coverage, better tuned navigation algorithms, better user experience because iteration is 10x cheaper, etc.
The same patterns apply to Kite. I’d like to give three quick examples, and then talk about the larger strategy.
- Data by the Terabyte. Kite uses lots of data to power the copilot experience. We index public documentation, maintain maps of the Python world (e.g. scipy.array is an alias for numpy.array), and surface patterns extracted from all of GitHub. We keep all of this in RAM, so you don’t have to. We run servers with 32 GB of RAM; while some of you may have that kind of rig (we’re jealous!), the typical MacBook Pro doesn’t. This data set will grow as we add support for more programming languages and more functionality. With a cloud-based architecture you don’t need to preselect which languages you’ll use, or sacrifice gigabytes of memory on your machine.
- Machine Learning. Kite is powered by a number of statistical models, and we’re adding more over time. For example, Kite’s search and “Did you mean” features both use machine learning. Of course we could ship these to your local client, but our models will get smarter over time if we know which result you clicked on (like Google Search) and whether you accepted a suggested change to your code (like Google Spellcheck).
- Rapid ship cycles. We ship multiple times per week. This means our bugs get fixed faster, data is fresher, and you get the newest features as soon as possible.
The cloud and its resulting feedback loops lead to better products, faster. We’ve seen the same evolution across a number of verticals. A few examples:
- Outlook → Gmail
- Colocation → AWS
- Network File Share → Dropbox
- MS Office → Google Docs
In each of these cases, security had to be addressed. At first it wasn’t clear the world would make the jump. It didn’t happen all at once, and there are still people using the legacy technologies. This evolution takes time, and overall is very healthy.
So what does Kite need to do as a company excited about the possibilities of cloud-connected programming?
Security: Core Principles
Let’s talk about the security concerns that naturally arise from a cloud-powered programming copilot. As software developers, we have had security on our minds since the beginning. Frankly, many of us here at Kite would have left similar comments on the HN thread :). Many of you are rightfully concerned about security as well, so let’s jump in.
Our approach to security begins with a few core principles:
- Security is a journey, not a destination. We will never be done giving you the tools you need to control your data. We will also never be done earning your trust.
- Control. You should control what data gets sent to Kite’s backend and whether you want us to store it for your later use. We should offer as much control as we can.
- Transparency. You should understand what is happening. We need to communicate this repeatedly and clearly.
- We’re building the future together with you. We don’t presume to have all of the answers. We want to work with all of you to find the best solutions.
We are committed to these principles. We want you and your employer to be excited about using Kite, and we think these principles are a good first step.
Let’s look at some examples of how we’ll put these principles into action.
You should be able to control
- Which directories and files, if any, are indexed by Kite,
- Whether Kite should remember your history of code changes,
- Whether Kite should help with terminal commands,
- Whether Kite should remember terminal commands you’ve previously written,
- Whether Kite should remember the output of past terminal commands,
- …and you should be able to easily turn these switches on and off.
- If you change a setting, we should ask if you’d like to delete historical data, as applicable.
You should always be able to see
- What files Kite has indexed (and permanently remove them as needed),
- What terminal commands, or file edits, are being remembered by Kite (and permanently remove them as needed),
- …and Kite should check in periodically to verify that your security settings match your preferences.
These are the first levels of control and transparency, which are based on files, directories, and the type of information (terminal versus editor).
Secrets, like passwords or keys, are a category of content that deserves special attention. We don’t want secrets on our servers, and we will be developing multiple mechanisms (automated and manual) to make sure they stay off our servers. We don’t have specifics to announce yet, but we believe we will set industry standards that will be adopted across multiple categories of tools such as continuous integration and code review systems.
Since last week’s launch we have begun building some of these principles into the product. I’d like to show you one feature we shipped yesterday. It’s called the lockout screen.
Kite’s Security panel asks users to whitelist the directories where Kite should be turned on. Code living outside of this whitelist never gets read by Kite. So what should the sidebar show if you open a Python file outside of the whitelist? As of yesterday’s addition, you’ll see something like this:
This interaction embodies the principles of transparency and control. It communicates what is happening, why, and gives you a one-click control mechanism to change what’s happening, if you so choose.
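As a rough illustration of the whitelist rule described above, here is a minimal Python sketch. It is not Kite’s actual implementation: the function names is_whitelisted and sidebar_state, the "active" and "lockout" labels, and the example paths are all hypothetical, chosen only to show the idea that a file is read only when it sits inside a whitelisted directory.

```python
from pathlib import Path


def is_whitelisted(file_path, whitelist):
    """Return True if file_path lies inside any whitelisted directory.

    Hypothetical helper for illustration; not Kite's real code.
    """
    target = Path(file_path).resolve()
    for directory in whitelist:
        root = Path(directory).resolve()
        # Compare whole path components rather than string prefixes, so
        # "/home/me/projects-private" is not mistaken for a child of
        # "/home/me/projects".
        if target == root or root in target.parents:
            return True
    return False


def sidebar_state(file_path, whitelist):
    """Decide what the sidebar shows: normal content or the lockout screen."""
    return "active" if is_whitelisted(file_path, whitelist) else "lockout"


if __name__ == "__main__":
    allowed = ["/home/me/projects"]
    print(sidebar_state("/home/me/projects/app/main.py", allowed))
    print(sidebar_state("/home/me/taxes/notes.py", allowed))
```

Checking path components instead of raw string prefixes is a deliberate choice in this sketch: it keeps a sibling directory with a similar name from being treated as whitelisted by accident.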
The Future Ahead
We are committed to incorporating the principles of control and transparency into the foundations of Kite. We will write more about security on our blog as we design and implement these features.
That said, we realize that everyone has different needs. We can’t promise that the options and functionality we choose on day 1 will be perfect for everyone, but we’re working day and night to expand the circle as widely as possible. We’ll do this tirelessly over the long term.
We’d love to hear your thoughts along the way. It’s only been a week, but all of you have been incredibly helpful as we learn how to get this right. As always, we encourage you to talk with us on Twitter at @kitehq.
Nothing makes us happier than knowing so many of you are equally excited about the Kite vision. The future of programming is awesome. Let’s build it together!
P.S. We are hiring! We are looking for frontend web devs, generalist systems engineers, programming language devs, and mac/windows/linux developers. You can reach us at firstname.lastname@example.org.