Categories
AI

Study provides insights on GitHub Copilot’s impact on developer productivity



Writing software code has recently become a promising use case for large language models (LLMs) like GPT-3. At the same time, as with many developments in artificial intelligence (AI), there are concerns about how much of the excitement surrounding LLM-powered coding is hype.

A new study by GitHub shows that Copilot, its AI programming assistant, results in a significant increase in developer productivity and happiness. Copilot uses Codex, a specialized version of GPT-3 trained on gigabytes of software code, to autocomplete instructions, generate entire functions, and automate other parts of writing source code.

The study comes one year after GitHub launched the technical preview of its Copilot tool and just a few months after it became publicly available. GitHub’s study surveyed more than 2,000 programmers, mostly professional developers and students, who have used Copilot over the past year.

While AI-assisted coding is still a new field and needs more research, GitHub’s study provides a good look at what to expect from tools such as Copilot.


Happiness and productivity 

According to GitHub’s findings, 60–75% of developers feel “more fulfilled with their job, feel less frustrated when coding, and can focus on more satisfying work” when using its Copilot tool.

Feeling fulfilled and satisfied is a subjective experience, though there are some common traits across what developers reported.

“Knowledge workers in general – and that includes software developers – are intrigued and motivated by problem-solving, and creativity,” GitHub researcher Eirini Kalliamvakou told VentureBeat. “For example, a developer tends to find it more satisfying to think about what design patterns to use, or how to architect a solution that implements a particular logic, drives an outcome, or solves a problem. Compared to that, the rote memorization of syntax or ordering of parameters is considered ‘toil’ that most developers would love to get through quickly.”

Eighty-seven percent of respondents reported that Copilot helps them “preserve mental effort during repetitive tasks.” These are tasks that are frustrating and error-prone, such as writing a SQL migration to update the schema of a database.

“With the exception of database administrators, developers may not write SQL migrations often enough to remember all of the particular SQL syntaxes,” Kalliamvakou said. “But it’s a task that happens often enough for the mental cost of the non-immediate recall to add up. GitHub Copilot removes much of the effort in this scenario.”
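To make the example concrete, below is a minimal sketch of such a one-off schema migration, written in Python with the standard sqlite3 module. The users table, the last_login column, and the app.db file are hypothetical, chosen only to illustrate the kind of easily forgotten ALTER TABLE syntax a developer might lean on Copilot to recall.

```python
import sqlite3

# Hypothetical schema, used only for this illustration.
SETUP = "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT);"
# The one-off migration statement whose exact syntax is easy to forget.
MIGRATION = "ALTER TABLE users ADD COLUMN last_login TEXT;"

def apply_migration(db_path: str) -> None:
    """Apply the schema migration to the SQLite database at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(SETUP)      # ensure the demo table exists
        conn.execute(MIGRATION)  # add the new column (errors if already applied)
        conn.commit()
    finally:
        conn.close()

if __name__ == "__main__":
    apply_migration("app.db")
```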

Developers tend to “stay in the flow” when using Copilot, the survey found, meaning they spend less time browsing reference documents and online forums like StackOverflow for solutions. Instead, they prompt Copilot with a text description and get code that is mostly correct and might need only a bit of tweaking.

Faster task completion

More than 90% of the survey’s respondents reported that Copilot helps them complete tasks faster, an expected finding. To measure the speed improvement more rigorously, GitHub conducted a controlled experiment, recruiting 95 developers and giving them the task of writing a basic HTTP 1.1 server from scratch in JavaScript.

The participants were divided into two groups: a test group of 45 developers who used Copilot and a control group of 50 developers who did not use the AI assistant. While the task completion rate was not dramatically different between the two groups, completion time was: the Copilot group finished the server code in less than half the time the control group needed.
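For readers unfamiliar with the benchmark, the sketch below shows roughly what a bare-bones HTTP 1.1 server involves. The study’s participants wrote theirs in JavaScript; this is an illustrative Python analogue rather than the experiment’s code, and the handler, response body, and port are invented.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    # Advertise HTTP/1.1, which enables persistent connections and
    # requires an accurate Content-Length header on every response.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Serve on localhost:8080 until interrupted.
    HTTPServer(("127.0.0.1", 8080), HelloHandler).serve_forever()
```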

While this is an important finding, it would be interesting to see which types of tasks Copilot helped with most and which areas still required more manual coding. Although GitHub did not have figures to share in this regard, Kalliamvakou told VentureBeat that she and her group are “performing more analysis on the code the participants wrote, and plan to share more in the near future.”

Code review and security

It is worth noting that LLMs do not understand and generate code in the same way that humans do, which has raised concerns among researchers. One of these concerns, which is also mentioned in the original Codex paper, is the possibility of AI tools providing erroneous and possibly insecure code suggestions. There are also concerns that over time, developers could start accepting Copilot suggestions without reviewing the code it generates, which can cause vulnerabilities and open new attack vectors.

While GitHub’s new study does not have any information on how Copilot affects secure coding practices, Kalliamvakou said that GitHub continues to work on improving the model and code suggestions. Meanwhile, she stressed that suggestions by GitHub Copilot should be “carefully tested, reviewed, and vetted, like any other code.”

“As GitHub Copilot improves, we will work to exclude insecure or low-quality code from the training set. We think in the long-term, Copilot will be writing more secure code than the average programmer,” Kalliamvakou said.

Kalliamvakou added that GitHub’s studies of Copilot have revealed new areas where AI can help developers, including support for Markdown, better interaction between Copilot and Intellisense suggestions, and using the tool in other parts of the software development lifecycle, including testing and code review.

“Our largest investment is in improving the model, and the quality of suggestions provided by GitHub Copilot since that is the source of the noticeable benefits our users experience,” Kalliamvakou said. “Over time, we expect that GitHub Copilot will be able to remove more of the boilerplate and repetitive coding that developers see as taxing, creating more room for job satisfaction and fulfillment.”



Categories
Security

GitHub will require all code contributors to use two-factor authentication

GitHub, the code hosting platform used by tens of millions of software developers around the world, announced today that all users who upload code to the site will need to enable one or more forms of two-factor authentication (2FA) by the end of 2023 in order to continue using the platform.

The new policy was announced Wednesday in a blog post by GitHub’s chief security officer (CSO) Mike Hanley, which highlighted the Microsoft-owned platform’s role in protecting the integrity of the software development process in the face of threats created by bad actors taking over developers’ accounts.

“The software supply chain starts with the developer,” Hanley wrote. “Developer accounts are frequent targets for social engineering and account takeover, and protecting developers from these types of attacks is the first and most critical step toward securing the supply chain.”

Even though multi-factor authentication provides significant additional protection to online accounts, GitHub’s internal research shows that only around 16.5 percent of active users (roughly one in six) currently enable the enhanced security measures on their accounts — a surprisingly low figure given that the platform’s user base should be aware of the risks of password-only protection.

By steering these users towards a higher minimum standard of account protection, GitHub hopes to boost the overall security of the software development community as a whole, Hanley told The Verge.

“GitHub is in a unique position here, just by virtue of the vast majority of open source and creator communities living on GitHub.com, that we can have a significant positive impact on the security of the overall ecosystem by raising the bar from a security hygiene perspective,” Hanley said. “We feel like it’s really one of the best ecosystem-wide benefits that we can provide, and we’re committed to making sure that we work through any of the challenges or obstacles to making sure that there’s successful adoption.”

GitHub has already established a precedent for the mandatory use of 2FA with a smaller subset of platform users, having trialled it with contributors to popular JavaScript libraries distributed through the package management software NPM. Since widely used NPM packages can be downloaded millions of times per week, they make a very attractive target for malware gangs. In some cases, hackers compromised NPM contributor accounts and used them to publish software updates that installed password stealers and crypto miners.

In response, GitHub made two-factor authentication mandatory for the maintainers of the 100 most popular NPM packages as of February 2022. The company plans to extend the same requirements to contributors to the top 500 packages by the end of May.

Insights from this smaller trial will be used to smooth out the process of rolling out 2FA across the platform, Hanley said. “I think we have a great benefit of the fact that we’ve already done this now on NPM,” he said. “We have learned a lot from that experience, in terms of feedback we’ve gotten from developers and creator communities that we’ve talked to, and we had a very active dialog about what good [practice] looks like with them.”

Broadly speaking, this means setting a long lead time for making the use of 2FA mandatory site-wide, and designing a range of onboarding flows to nudge users towards adoption well before the end-of-2023 deadline, Hanley said.

Securing open-source software is still a pressing concern for the software industry, particularly after last year’s log4j vulnerability. But while GitHub’s new policy will mitigate some threats, systemic challenges remain: many open source projects are still maintained by unpaid volunteers, and closing that funding gap remains a major challenge for the tech industry as a whole.


Categories
AI

GitHub and OpenAI launch an AI Copilot tool that generates its own code

GitHub and OpenAI have launched a technical preview of a new AI tool called Copilot, which lives inside the Visual Studio Code editor and autocompletes code snippets.

Copilot does more than just parrot back code it’s seen before, according to GitHub. It instead analyzes the code you’ve already written and generates new matching code, including specific functions that were previously called. Examples on the project’s website include automatically writing the code to import tweets, draw a scatterplot, or grab a Goodreads rating.
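The scatterplot case gives a feel for the workflow: the developer writes a short comment or function name, and the assistant proposes a body. The snippet below is an illustrative sketch in Python with matplotlib, not an actual Copilot output; the function name and sample data are made up.

```python
import matplotlib.pyplot as plt

# draw a scatter plot of height vs. weight
def plot_height_vs_weight(heights, weights):
    plt.scatter(heights, weights)
    plt.xlabel("height (cm)")
    plt.ylabel("weight (kg)")
    plt.title("Height vs. weight")
    plt.show()

if __name__ == "__main__":
    # Made-up sample data for the demo.
    plot_height_vs_weight([150, 160, 170, 180], [50, 60, 70, 80])
```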

It works best with Python, JavaScript, TypeScript, Ruby, and Go, according to a blog post from GitHub CEO Nat Friedman.

GitHub sees this as an evolution of pair programming, where two coders work on the same project to catch each other’s mistakes and speed up the development process. With Copilot, one of those coders is virtual.

This project is the first major result of Microsoft’s $1 billion investment in OpenAI, the research firm now led by former Y Combinator president Sam Altman. Since Altman took the reins, OpenAI has pivoted from nonprofit status to a “capped-profit” model, taken on the Microsoft investment, and started licensing its GPT-3 text-generation algorithm.

Copilot is built on a new algorithm called OpenAI Codex, which OpenAI CTO Greg Brockman describes as a descendant of GPT-3.

GPT-3 is OpenAI’s flagship language-generating algorithm, which can produce text that is sometimes indistinguishable from human writing. It writes so convincingly because of its sheer size: 175 billion parameters, or adjustable knobs that allow the algorithm to model relationships between letters, words, phrases, and sentences.

While GPT-3 generates English, OpenAI Codex generates code. OpenAI plans to release a version of Codex through its API later this summer so developers can build their own apps with the tech, a representative for OpenAI told The Verge in an email.

Codex was trained on terabytes of openly available code pulled from GitHub, as well as English language examples.

While testimonials on the site rave about the productivity gains Copilot provides, GitHub implies that not all of the code it was trained on was vetted for bugs, insecure practices, or personal data. The company writes that it has put a few filters in place to prevent Copilot from generating offensive language, but they might not be perfect.

“Due to the pre-release nature of the underlying technology, GitHub Copilot may sometimes produce undesired outputs, including biased, discriminatory, abusive, or offensive outputs,” Copilot’s website says.

Given criticisms of GPT-3’s bias and abusive language patterns, it seems that OpenAI hasn’t found a way to prevent algorithms from inheriting their training data’s worst elements.

The company also warns that the model could suggest email addresses, API keys, or phone numbers, but says this is rare and that such data has been found to be synthetic or pseudo-randomly generated by the algorithm. The code generated by Copilot is also largely original: a test performed by GitHub found that only 0.1 percent of generated code could be found verbatim in the training set.

This isn’t the first project to try to automatically generate code to help toiling programmers. The startup Kite pitches very similar functionality, with availability in more than 16 code editors.

Right now, Copilot is in a restricted technical preview, but you can sign up on the project’s website for a chance to access it.


Categories
AI

GitHub launches Copilot to power pair programming with AI




GitHub has launched a new AI-powered pair programmer that collaborates with humans on their software development projects, suggesting lines or entire functions as the coder types.

Pair programming, for the uninitiated, is a common agile software development technique where two (usually human) programmers work in tandem at a single screen, taking turns to write code and review the output of their partner.

GitHub Copilot

Copilot, as the new GitHub tool is called, uses contextual cues to suggest new code, with users able to flip through alternatives if they don’t like Copilot’s initial suggestion, or manually edit it. Copilot also learns over time, so that the more code, docstrings, comments or function names a developer writes, the smarter Copilot should become.

Above: GitHub Copilot in action
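The contextual cues Copilot reads are mostly names, comments, and docstrings. The sketch below is illustrative rather than an actual Copilot suggestion: given the function name and docstring, a completion engine would typically propose a body along these lines, which the developer can accept, cycle past, or edit. The days_until function is invented for the example.

```python
from datetime import date

def days_until(deadline: date) -> int:
    """Return how many whole days remain until the given deadline."""
    # From the name and docstring above, a completion engine would
    # typically suggest a body like this one.
    return (deadline - date.today()).days

if __name__ == "__main__":
    print(days_until(date(2026, 1, 1)))  # e.g., days until New Year 2026
```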

Copilot is perhaps a little like Gmail’s smart compose feature, which suggests the next piece of text in an email response.

Machine power

The concept of what is effectively an AI-powered autocomplete for code is not entirely new. Codota and Tabnine have offered something similar for a while, and the two companies actually merged back in 2019 ahead of a $12 million fundraise for Codota, before the duo finally settled on Tabnine as the main brand name last month.

More broadly, machine programming tools are rearing their heads across the spectrum, with Microsoft recently announcing a new Power Apps (software for creating low-code business apps) feature that leverages OpenAI’s GPT-3 language model to help users choose the right formulas.

Similarly, the new GitHub Copilot feature leans heavily on a collaboration with OpenAI, the AI research company in which GitHub’s parent company Microsoft invested $1 billion in 2019. Copilot, though, uses a new AI system called OpenAI Codex, which is touted as “significantly more capable than GPT-3 in code generation,” according to a GitHub blog post today. Because it was trained on a dataset that incorporates more public source code, OpenAI Codex should be more knowledgeable about how developers write code and able to make more accurate suggestions.

OpenAI Codex was also trained on both source code and natural language, meaning that it is able to interpret comments and logic when assembling the code.

Above: GitHub Copilot in action (find files)

While GitHub’s new AI pair programmer could help experienced developers save some time, it may prove particularly fruitful for coders new to a specific language or framework — GitHub Copilot saves them from having to search elsewhere on the web for answers to their coding conundrums.

Availability

GitHub Copilot launches today in technical preview, and is available as an extension for Microsoft’s cross-platform code editor Visual Studio Code, working locally or in the cloud. While Copilot is designed to work with a broad gamut of languages and frameworks, at launch it’s particularly adept at JavaScript, Python, Ruby, TypeScript, and Go.

It is worth stressing that GitHub Copilot is not designed to write code on behalf of the developer — it’s more about helping developers by understanding their intent. GitHub also gives no guarantees that the code it generates will even work, as it doesn’t test the code — this means that it may not compile properly. So there are some risks, but it’s still very early days for Copilot.



Categories
Tech News

GitHub now lets developers upload videos to their repositories

GitHub just added support for videos in its project repositories. This makes it easier for developers and other contributors to show design concepts or bug reproductions without writing lengthy descriptions and uploading a bunch of screenshots.

The code hosting company started testing this feature last December, and now it’s available to everyone. You can upload .mp4 and .mov files to issues, pull requests, and discussion comments.
