This is a transcript of What's Up With That Episode 1, a 2022 video discussion between Sharon (yangsharon@chromium.org) and Dana (danakj@chromium.org).
The transcript was automatically generated by speech-to-text software. It may contain minor errors.
Welcome to the first episode of What's Up With That, all about pointers! Our special guest is C++ expert Dana. This talk covers smart pointer types we have in Chrome, how to use them, and what can go wrong.
0:00 SHARON: Hi, everyone, and welcome to the first installment of "What's Up With That", the series that demystifies all things Chrome. I'm your host, Sharon, and today's inaugural episode will be all about pointers. There are so many types of types - which one should I use? What can possibly go wrong? Our guest today is Dana, who is one of our Base and C++ OWNERS and is currently working on introducing Rust to Chromium. Previously, she was part of bringing C++11 support to the Android NDK and then to Chrome. Today, she'll be telling us what's up with pointers. Welcome, Dana!
00:31 DANA: Thank you, Sharon. It's super exciting to be here. Thank you for letting me be on your podcast thingy.
00:36 SHARON: Yeah, thanks for being the first episode. So let's just jump right in. So when you use pointers wrong, what can go wrong? What are the problems? What can happen?
00:48 DANA: So pointers are a big cause of security problems for Chrome, and that's what we mostly think about when things go wrong with pointers. So you have a pointer to some thing, like you've pointed to a goat. And then you delete the goat, and you allocate some new thing - a cow. And it gets stuck in the same spot. Your pointer didn't change. It's still pointing to what it thinks is a goat, but there's now a cow there. And so when you go to use that pointer, you use something different. And this is a tool that malicious actors use to exploit software, like Chrome, in order to gain access to your system, your information, et cetera.
01:39 SHARON: And we want to avoid those. So what's that general type of attack called?
01:39 DANA: That's a Use-After-Free because you have freed the goat and replaced it with a cow. And you're using your pointer, but the thing it pointed to was freed. There are other kinds of pointer badness that can happen. If you take a pointer and you add to it some number, or you go to an offset off the pointer, and you have an array of five things, and you go and read 20, or minus 2, or something, now you're reading out of bounds of that memory allocation. And that's not good. These are both memory safety bugs that occur a lot with pointers.
02:23 SHARON: Today, we'll be mostly looking at the Use-After-Free kind of bugs. We definitely see a lot of those. And if you want to see an example of one being used, Dana has previously done a talk called "Life of a Vulnerability." It'll be linked below. You can check that out. So that being said, should we ever be using just a regular raw pointer in C++ in Chrome?
02:41 DANA: First of all, let's call them native pointers. You will see them called raw pointers a lot in literature and stuff. But later on, we'll see why that could be a bit ambiguous in this context. So we'll call them a native pointer. So should you use a native pointer? If you don't want to Use-After-Free, if you don't want a problem like that, no. However, there is a performance implication with using smart pointers, and so the answer is yes. The style guide that we have right now takes this pragmatic approach of saying you should use raw pointers for giving access to an object. So if you're passing them as a function parameter, you can share it as a pointer or a reference, which is like a pointer with slightly different rules. But you should not store native pointers as fields in objects because that is a place where they go wrong a lot. And you should not use a native pointer to express ownership. So before C++11, you would just say, this is my pointer, use a comment, say this one is owning it. And then if you wanted to pass the ownership, you just pass this native pointer over to something else as an argument, and put a comment and say this is passing ownership. And you just kind of hope it works out. But then it's very difficult. It requires the programmer to understand the whole system to do it correctly. There is no help. So in C++11, the type called std::optional_ptr - or sorry, std::unique_ptr - was introduced. And this is expressing unique ownership. That's why it's called unique_ptr. And it's just going to hold your pointer, and when it goes out of scope, it gets deleted. It can't be copied because it's unique ownership. But it can be moved around. And so if you're going to express ownership to an object in the heap, you should use a unique_ptr.
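To make the unique-ownership rules Dana describes concrete, here is a minimal sketch in standard C++. The `Goat` type and the function names are hypothetical and exist only for illustration; nothing here is Chromium code.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical type used only for illustration. It counts destructions
// through a pointer supplied by the caller.
struct Goat {
  explicit Goat(int* destroyed_count) : destroyed_count_(destroyed_count) {}
  ~Goat() { ++*destroyed_count_; }
  int* destroyed_count_;
};

// Taking a std::unique_ptr by value documents that ownership is transferred
// to this function. The Goat is deleted when `goat` goes out of scope.
void TakeOwnership(std::unique_ptr<Goat> goat) {}

// Returns how many Goat objects have been destroyed by the end of the demo.
int UniqueOwnershipDemo() {
  int destroyed = 0;
  {
    std::unique_ptr<Goat> owner = std::make_unique<Goat>(&destroyed);
    // std::unique_ptr<Goat> copy = owner;  // Does not compile: not copyable.
    std::unique_ptr<Goat> new_owner = std::move(owner);  // Moving is allowed.
    assert(owner == nullptr);  // The moved-from pointer owns nothing now.
    TakeOwnership(std::move(new_owner));
    assert(destroyed == 1);  // Deleted when TakeOwnership returned.
  }
  return destroyed;
}
```

The type system, rather than a comment, now states who owns the object and when it is deleted.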
04:48 SHARON: That makes sense. And that sounds good. So you mentioned smart pointers before. You want to tell us a bit more about what those are? It sounds like unique_ptr is one of those.
04:55 DANA: Yes, so a smart pointer, which can also be referred to as a pointer-like object, perhaps as a subset of them, is a class that holds inside of it a pointer and mediates access to it in some way. So unique_ptr mediates access by saying I own this pointer, I will delete this pointer when I go away, but I'll give you access to it. So you can use the arrow operator or the star operator to get at the underlying pointer. And you can construct them out of native pointers as well. So that's an example of a smart pointer. There's a whole bunch of smart pointers, but that's the general idea. I'm going to add something to what a native pointer is, while giving you access to it in some way.
05:40 SHARON: That makes sense. That's kind of what our main thing is going to be about today because you look around in Chrome, you'll see a lot of these wrapper types. It'll be a unique_ptr and then a type. And you'll see so many types of these, and talking to other people, myself, I find this all very confusing. So we'll cover some of the more common types today. We just talked about unique pointers. Next, talk about absl::optional. So why don't you tell us about that.
06:10 DANA: So that's actually a really good example of a pointer-like object that's not actually holding a pointer, so it's not a smart pointer. But it looks like one. So this is this distinction. So absl::optional, also known as std::optional, if you're not working in Chromium, and at some point, we will hopefully migrate to it, std::optional and absl::optional hold an object inside of it by value instead of by pointer. This means that the object is held in that space allocated for the optional. So the size of the optional is the size of the thing it's holding, plus some space for a presence flag. Whereas a unique_ptr holds only a pointer. And its size is the size of a pointer. And then the actual object lives elsewhere. So that's the difference in how you can think about them. But otherwise, they do look quite similar. An optional is a unique ownership because it's literally holding the object inside of it. However, an optional is copyable if the object inside is copyable, for instance. So it doesn't have quite the same semantics. And it doesn't require a heap allocation, the way unique_ptr does because it's storing the memory in place. So if you have an optional on the stack, the object inside is also right there on the stack. That's good or bad, depending what you want. If you're worried about your object sizes, not so good. If you're worried about the cost of memory allocation and free, good. So this is the trade-off between the two.
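The storage difference can be checked directly. The sketch below uses std::optional from C++17 as a stand-in for absl::optional; the `Big` type and the function names are hypothetical, and the address comparisons are only a rough illustration of "inline" versus "elsewhere".

```cpp
#include <cstdint>
#include <memory>
#include <optional>

// Hypothetical payload type, large enough to make the size difference obvious.
struct Big {
  char bytes[256];
};

// True if the Big held by a std::optional lives inside the optional's own
// storage, which is what "held by value" means.
bool OptionalStoresInline() {
  std::optional<Big> maybe_big = Big{};
  auto begin = reinterpret_cast<std::uintptr_t>(&maybe_big);
  auto value = reinterpret_cast<std::uintptr_t>(&*maybe_big);
  return value >= begin && value + sizeof(Big) <= begin + sizeof(maybe_big);
}

// True if the Big owned by a std::unique_ptr lives outside the unique_ptr
// object itself (it was allocated on the heap by make_unique).
bool UniquePtrStoresElsewhere() {
  std::unique_ptr<Big> big = std::make_unique<Big>();
  auto begin = reinterpret_cast<std::uintptr_t>(&big);
  auto value = reinterpret_cast<std::uintptr_t>(big.get());
  return value < begin || value >= begin + sizeof(big);
}
```

On common standard library implementations `sizeof(std::unique_ptr<Big>)` equals `sizeof(Big*)`, while `sizeof(std::optional<Big>)` is `sizeof(Big)` plus room for the presence flag.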
07:51 SHARON: Can you give any examples of when you might want to use one versus the other? Like you mentioned some kind of general trade-offs, but any specific examples? Because I've definitely seen use cases where unique_ptr is used when maybe an optional makes more sense or vice versa. Maybe it's just because someone didn't know about it or it was chosen that way. Do you have any specific examples?
08:14 DANA: So one place where you might use a unique_ptr, even though optional is maybe the better choice, is because of forward declarations. So because an optional holds the type inside of it, it needs to know the type size, which means it needs to know the full declaration of that type, or the whole definition of that type. And a unique_ptr doesn't because it's just holding a pointer, so it only needs to know the size of a pointer. And so if you have a header file, and you don't want to include another header file, and you just want to forward declare the types, you can't stick an optional of that type right there because you don't know how big it's supposed to be. So that might be a case where it's maybe not the right choice, but for other constraining reasons, you choose to use a unique_ptr here. And you pay the cost of a heap allocation and free as a result. But when would you use an optional? So optional is fantastic for returning a value sometimes. I want to do this thing, and I want to give you back a result, but I might fail. Or sometimes there's no value to give you back. Typically, before C++ - what are we on now, was it came in 14? I'm going to say it wrong. That's OK. Before we had absl::optional, you would have to do different tricks. So you would pass in a native pointer as a parameter and return a bool as the return value to say did I populate the pointer. And yes, that works. But it's easy to mess it up. It also generates less optimal code. Pointers cause the optimizer to have troubles. And it doesn't express as nicely what your intention is. A return, this thing, sometimes. And so in place of using this pointer plus bool, you can put that into a single type, return an optional. Similar for holding something as a field, where you want it to be held inline in your class, but you don't always have it present, you can do that with an optional now, where you would have probably used a pointer before. Or a union or something, but that gets even more tricky. And then another place you might use it is as a function argument. However, that's usually not the right choice for a function argument. Why? Because the optional holds the value inside of it. Constructing an optional requires constructing the whole object inside of it. And so that's not free. It can be arbitrarily expensive, depending on what your type is. And if your caller to your function doesn't have already an optional, they have to go and construct it to pass it to you. And that's a copy or move of that inner type. So generally, if you're going to receive a parameter, maybe sometimes, the right way to spell that is just to pass it as a native pointer, which can be null, when it's not present.
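The two return styles, and the nullable-pointer parameter, can be compared side by side. This is a sketch in standard C++17 with std::optional standing in for absl::optional; the function names are hypothetical.

```cpp
#include <optional>
#include <string>

// Older style: the bool says whether `*out` was populated. The caller has to
// remember to check the bool before reading `out`.
bool ParseDigitOld(char c, int* out) {
  if (c < '0' || c > '9') return false;
  *out = c - '0';
  return true;
}

// With optional, a single return value expresses "a result, sometimes".
std::optional<int> ParseDigit(char c) {
  if (c < '0' || c > '9') return std::nullopt;
  return c - '0';
}

// For a parameter that is only sometimes present, a nullable native pointer
// avoids making the caller construct (copy or move into) an optional.
int LengthOrZero(const std::string* maybe_name) {
  return maybe_name ? static_cast<int>(maybe_name->size()) : 0;
}
```

A caller of `ParseDigit` cannot read the value without going through the optional, which makes the "did it work?" question harder to forget than with the out-parameter version.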
11:29 SHARON: Hopefully that clarifies some things for people who are trying to decide which one best suits their use case. So moving on from that, some people might remember from a couple of years ago that instead of being called absl::optional, it used to be called base::optional. And do you want to quickly mention why we switched from base to absl? And you mentioned even switching to std::optional. Why this transition?
11:53 DANA: Yeah, absolutely. So as the C++ standards come out, we want to use them, but we can't until our toolchain is ready. What's our toolchain? It's our compiler, our standard library - and unfortunately, we have more than one compiler that we need to worry about. So we have the NaCl compiler. Luckily, we just have Clang for the compiler choice we really have to worry about. But we do have to wait for these things to be ready, and for a code base to be ready to turn on the new standard because sometimes there are some non-backwards compatible changes. But we can forward port stuff out of the standard library into base. And so we've done that. We have a bunch of C++20 backports in base now. We had 17 backports before. We turned on 17, now they should hopefully be gone. And so base::optional was an example of a backport, while optional was still considered experimental in the standard library. We adopted use of absl since then, and absl had also, essentially, a backport of the optional type inside of it for presumably the same reasons. And so why have two when you can have one? That's a pretty good rule. And so we deprecated the base one, removed it, and moved everything to the absl one. One thing to note here, possibly of interest, is we often add security hardening to things in base. And so sometimes there is available in the standard library something. But we choose not to use it and use something in base or absl, but we use it in base instead, because we have extra hardening checks. And so part of the process of removing base::optional and moving to absl::optional was ensuring those same security hardening checks are present in absl. And we're going to have to do the same thing to stop using absl and start using the standard one. And that's currently a work in progress.
13:48 SHARON: So let's go through some of the base types because that's definitely where the most of these kind of wrapper types live. So let's just start with one that I learned about recently, and that's a scoped_refptr. What's that? When should we use it?
13:59 DANA: So scoped_refptr is kind of your Chromium equivalent to shared_ptr in the standard library. So if you're familiar with that, it's quite similar, but it has some slight differences. So what is scoped_refptr? It gives you shared ownership of the underlying object. And it's a smart pointer. It holds a pointer to an object that's allocated in the heap. When all scoped_refptrs that point to the same object are gone, it'll be deleted. So it's like unique_ptr, except it can be copied to add to your ref count, basically. And when all of them are gone, it's destroyed. And it gives access to the underlying pointer in exactly the same ways. Oh, but why is it different than shared_ptr? I did say it is. scoped_refptr requires the type that is held inside of it to inherit from RefCounted or RefCountedThreadSafe. shared_ptr doesn't require this. Why? So shared_ptr sticks an allocation beside your object and then puts your object here. So the ref count is externalized to your object being stored and owned by the shared pointer. Chromium took this position to be doing intrusive ref counting. So because we inherit from a known type, we stick the ref count in that base class, RefCounted or RefCountedThreadSafe. And so that is enforced by the compiler. You must inherit from one of these two in order to be stored and owned in a scoped_refptr. What's the difference? RefCounted is the default choice, but it's not thread safe. So the ref counting is cheap. It's the more performant one, but if you have a scoped_refptr on two different threads owning the same object, their ref counting will race, can be wrong, you can end up with a double free - which is another way that pointers can go wrong, two things freeing the same thing - or you could end up with potentially not freeing it at all, probably. I guess I've never checked if that's possible. But they can race, and then bad things happen. Whereas, RefCountedThreadSafe gives you atomic ref counting. So atomic means that across all threads, they're all going to have the same view of the state. And so it can be used across threads and be owned across threads. And the tricky part there is the last thread that owns that object is where it's going to be destroyed. So if your object's destructor does things that you expect to happen on a specific thread, you have to be super careful that you synchronize which thread that last reference is going away on, or it could explode in a really flaky way.
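Intrusive ref counting can be sketched in a few lines of standard C++. The names `RefCountedSketch`, `RefPtrSketch`, and `Cow` below are hypothetical and greatly simplified; this is not how Chromium's RefCounted or scoped_refptr are actually implemented, only an illustration of keeping the count inside a base class of the owned object.

```cpp
#include <cassert>

// Hypothetical base class: the ref count lives inside the object itself.
// This version is not thread safe, like the non-atomic flavor described above.
class RefCountedSketch {
 public:
  virtual ~RefCountedSketch() = default;
  void AddRef() { ++ref_count_; }
  // Returns true when the last reference was just dropped.
  bool Release() { return --ref_count_ == 0; }
  int ref_count() const { return ref_count_; }

 private:
  int ref_count_ = 0;
};

// Hypothetical smart pointer: T must provide AddRef() and Release(), so the
// compiler rejects types that do not inherit from the base class.
template <typename T>
class RefPtrSketch {
 public:
  explicit RefPtrSketch(T* ptr) : ptr_(ptr) {
    if (ptr_) ptr_->AddRef();
  }
  RefPtrSketch(const RefPtrSketch& other) : ptr_(other.ptr_) {
    if (ptr_) ptr_->AddRef();
  }
  RefPtrSketch& operator=(const RefPtrSketch&) = delete;  // Omitted for brevity.
  ~RefPtrSketch() {
    if (ptr_ && ptr_->Release()) delete ptr_;
  }
  T* operator->() const { return ptr_; }

 private:
  T* ptr_;
};

struct Cow : RefCountedSketch {
  explicit Cow(bool* destroyed) : destroyed_(destroyed) {}
  ~Cow() override { *destroyed_ = true; }
  bool* destroyed_;
};

// Returns true if the Cow was destroyed once the last owner went away.
bool SharedOwnershipDemo() {
  bool destroyed = false;
  {
    RefPtrSketch<Cow> first(new Cow(&destroyed));
    {
      RefPtrSketch<Cow> second = first;  // Copying adds a reference.
      assert(first->ref_count() == 2);
    }
    assert(first->ref_count() == 1);
    assert(!destroyed);  // One owner remains, so the Cow is still alive.
  }
  return destroyed;
}
```

A thread-safe variant would replace the plain `int` with `std::atomic<int>`, which is the extra cost that the atomic flavor pays on every copy and destruction.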
17:02 SHARON: This sounds useful in other ways. What are some kind of more design things to consider, in terms of when a scoped_refptr is useful and does help enforce things that you want to enforce, like relative lifetimes of certain objects?
17:15 DANA: Generally, we recommend that you don't use ref counting if you can help it. And that's because it's hard to understand when it's going to be destroyed, like I kind of alluded to with the thread situation. Even in a single thread situation, how do you know which one is the last reference? And is this object going to outlive that other object? Maybe sometimes. It's not super obvious. It's a little more clear with a unique_ptr, at least local to where that unique_ptr's destruction is. But there's usually no scoped_refptr where you can say this is the last one, so I know it's gone after this thing is gone. Maybe it is, maybe it's not, often. So it's a bit tricky. However, there are scenarios when you truly want a bunch of things to have access to a piece of data. And you want that data to go away when nobody needs it anymore. And so that is your use case for a scoped_refptr. It is nicer when that thing with shared ownership is not doing a lot of interesting things, especially in its destructor because of the complexity that's involved in shared ownership. But you're welcome to shoot yourself in the foot with this one if you need to.
18:33 SHARON: We're hoping to help people not shoot themselves in the foot. So use scoped_refptr carefully, is the lesson there. So you mentioned shared_ptr. Is that something we see much of in Chrome, or is that something that we generally try to avoid in terms of things from the standard library?
18:51 DANA: That is something that is banned in Chrome. And that's just basically because we already have scoped_refptr, and we don't want two of the same thing. There's been various times where people have brought up why do we need to have both? Can we just use shared_ptr now? And nobody's ever done the kind of analysis needed to make that kind of decision. And so we stay with what we're at.
19:18 SHARON: If you want to do that, there's someone that'll tell you what to do. So something that when I was using scoped_refptr, I came across that you need a WeakPtrFactory to create such a pointer. So weak pointers and WeakPtr factories are one of those things that you see a lot in Chrome and one of these base things. So tell us a bit about weak pointers and their factories.
19:42 DANA: So WeakPtr and WeakPtrFactory have a bit of an interesting history. Their major purpose is for asynchronous work. Chrome is basically a large asynchronous machine, and what does that mean? It means that we break all of the work of Chrome up into small pieces of work. And every time you've done a piece, you go and say, OK, I'm done. And when the next piece is ready, run this thing. And maybe that next thing is like a user input event, maybe that's a reply from the network, whatever it might be. And there's just a ton of steps in things that happen in Chrome. Like, a navigation has a request, a response, maybe another request - some redirects, whatever. That's an example of tons of smaller asynchronous tasks that all happen independently. So what goes wrong with asynchronous tasks? You don't have a continuous stack frame. What does that mean? So if you're just running some synchronous code, you make a variable, you go off and you do some things, you come back. Your variable is still here, right? You're in this stack frame and you can keep using it. You have asynchronous tasks. You make a variable, you go and do some work, and you are done your task. Boop, your stack's gone. You come back later, you're going to continue. You don't have your variable anymore. So any state that you want to keep across your various tasks has to be stored and what we call bound in with that task. If that's a pointer, that's especially risky. So we talked earlier about Use-After-Frees. Well, you can, I hope, imagine how easy it is to stick a pointer into your state. This pointer is valid, I'm using it. I go away, I come back when? I don't know, sometime in the future. And I'm going to go use this pointer. Is it still around? I don't own it. I didn't use a unique_ptr. So who owns it? How do they know that I have a task waiting to use it? Well, unless we have some side channel communicating that, they don't. And how do I know if they've destroyed it if we don't have some side channel communicating that? I don't know. And so I'm just going to use this pointer and bad things happen. Your bank account is gone.
22:06 SHARON: No! My bank account!
22:06 DANA: I know. So what's the side channel? The side channel that we have is WeakPtr. So a WeakPtr and WeakPtrFactory provide this communication mechanism where WeakPtrFactory watches an object, and when the object gets destroyed, the WeakPtrFactory inside of it is destroyed. And that sets this little bit that says, I'm gone. And then when your asynchronous task comes back with its pointer, but it's a WeakPtr inside of it and tries to run, it can be like, am I still here? If the WeakPtrFactory was destroyed, no, I'm not. And then you have a choice of what to do at that point. Typically, we're like, abandon ship. Don't do anything here. This whole task is aborted. But maybe you do something more subtle. That's totally possible.
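The "little bit that says, I'm gone" can be modeled with a flag shared between a factory and the handles it gives out. The sketch below is a simplified model in standard C++ with hypothetical names (`WeakHandle`, `WeakHandleFactory`, `Counter`); it is not Chromium's WeakPtr implementation and ignores threading entirely.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Hypothetical weak handle: a pointer plus a shared "still alive" flag.
template <typename T>
class WeakHandle {
 public:
  WeakHandle(T* ptr, std::shared_ptr<bool> alive)
      : ptr_(ptr), alive_(std::move(alive)) {}
  // Returns the object if it is still alive, otherwise nullptr.
  T* get() const { return *alive_ ? ptr_ : nullptr; }

 private:
  T* ptr_;
  std::shared_ptr<bool> alive_;
};

// Hypothetical factory: lives inside the watched object, and flips the flag
// when it is destroyed along with that object.
template <typename T>
class WeakHandleFactory {
 public:
  explicit WeakHandleFactory(T* owner)
      : owner_(owner), alive_(std::make_shared<bool>(true)) {}
  ~WeakHandleFactory() { *alive_ = false; }
  WeakHandle<T> GetWeakHandle() const { return WeakHandle<T>(owner_, alive_); }

 private:
  T* owner_;
  std::shared_ptr<bool> alive_;
};

class Counter {
 public:
  void Increment() { ++value_; }
  WeakHandle<Counter> AsWeak() { return factory_.GetWeakHandle(); }

 private:
  int value_ = 0;
  WeakHandleFactory<Counter> factory_{this};  // Last member: destroyed first.
};

// Binds a weak handle into a task, optionally destroys the target before the
// task runs, and returns true if the task actually did its work.
bool RunTaskDemo(bool destroy_before_run) {
  auto counter = std::make_unique<Counter>();
  bool ran = false;
  WeakHandle<Counter> weak = counter->AsWeak();
  std::function<void()> task = [weak, &ran] {
    if (Counter* c = weak.get()) {  // "Am I still here?"
      c->Increment();
      ran = true;
    }  // Otherwise: abandon ship, the task does nothing.
  };
  if (destroy_before_run) counter.reset();
  task();
  return ran;
}
```

The task never touches the `Counter` after it is destroyed; it only reads the shared flag, which outlives the object.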
22:59 SHARON: I think the example I actually meant to say that uses a WeakPtrFactory is a SafeRef, which is another base type. So tell us a bit about SafeRefs.
23:13 DANA: WeakPtr is cool because of the side channel that you can examine. So you can say are you still alive, dear object? And it can tell you, no, it's gone. Or yeah, it's here. And then you can use it. The problem with this is that in places where you as the code author want to believe that this object is actually always there, but you don't want a security bug if you're wrong. And it doesn't mean that you're wrong now, even. Sometime later, someone can change code, unrelated to where this is, where the ownership happens, and break you. And maybe they don't know all the users of a given object and changing its lifetime in some subtle way, maybe not even realizing they are. Suddenly you're eventually seeing security bugs. And so that's why native pointers can be pretty scary. And so SafeRef is something we can use instead of a native pointer to protect you against this type of bug. It's built on top of WeakPtr and WeakPtrFactory. That's its relationship, but its purpose is not the same. So what SafeRef does is it says - SafePtr?
24:31 SHARON: SafeRef.
24:31 DANA: SafeRef.
24:31 SHARON: I think there's also a safe pointer, but there -
24:38 DANA: We were going to add it. I'm not sure if it's there yet. But so two differences between SafeRef and WeakPtr then, ref versus ptr, it can't be null. So it's like a reference wrapper. But the other difference is you can't observe whether the object is actually alive or not. So it has the side channel, but it doesn't show it to you. Why would you want that? If the information is there anyway, why wouldn't you want to expose it? And the reason is because you are documenting that you as the author understand and expect that this pointer is always valid at this time. It turns out it's not valid. What do you do? If it's a WeakPtr, people tend to say, we don't know if it's valid. It's a WeakPtr. Let's check. Am I valid? And if I'm not, return. And what does that result in? It results in adding a branch to your code. You do that over, and over, and over, and over, and static analysis, which is what we as humans have to do - we're not running the program, we're reading the code - can't really tell what will happen because there's so many things that could happen. We could exit here, we could exit there, we could exit here. Who knows. And that makes it increasingly hard to maintain and refactor the code. So SafeRef gives you the option to say this is always going to be valid. You can't check it. So if it's not valid, go fix that bug somewhere else. It should be valid here.
26:16 SHARON: So what kind of -
26:16 DANA: The assumptions are broken.
26:16 SHARON: So what kind of errors happen when that assumption is broken? Is that a crash? Is that a DCHECK kind of thing?
26:22 DANA: For SafeRef and for WeakPtr, if you try to use it without checking it, or write it incorrectly, they will crash. And crashing in this case means a safe crash. It's not going to lead to a security bug. It's literally just terminating the program.
26:41 SHARON: Does that also mean you get a sad tab as a user? Like when the little sad file comes up?
26:47 DANA: Yep. It would. If you're in the render process, you take it down. It's a sad tab. So that's not great. It's better than a security bug. Because your options here are don't write bugs. Ideal. I love that idea, but we know that bugs happen. Use a native pointer, security problem. Use a WeakPtr, that makes sense if you want it to sometimes not be there. But if you want it to always be there - because you have to make a choice now of what you're supposed to do if it's not, and it makes the code very hard to understand. And you're only going to find out it can't be there through a crash anyhow. Or use a SafeRef. And it's going to just give you the option to crash. You're going to figure out what's wrong and make it no longer do that.
27:38 SHARON: I think wanting to guarantee the lifetime of some other things seems like a pretty common thing that you might come across. So I'm sure there are many cases for many people to be adding SafeRefs to make their code a bit safer, and also ensure that if something does go wrong, it's not leading to a memory bug that could be exploited in who knows how long. Because we don't always hear about those. If it crashes, and they can reliably crash, at least you know it's there. You can fix it. If it's not, we're hoping that one of our VRP vulnerability researchers find it and report it, but that doesn't always happen. So if we can know about these things, that's good. So another new type in base that people might have been seeing recently is a raw_ptr which is maybe why earlier we were saying let's call them native pointers, not raw pointers. Because the difference between raw_ptr and raw pointer, very easy to mix those up. So why don't you tell us a bit about raw_ptrs?
28:40 DANA: So raw_ptr is really cool. It's a non-owning smart pointer. So that's kind of like WeakPtr or SafeRef. These are also non-owning. And it's actually very similar in inspiration to what WeakPtr is. So it has a side channel where it can see if the thing it's pointing to is alive or gone. So for WeakPtr, it talks to the WeakPtrFactory and says "am I deleted?" And for raw_ptr, what it does is it keeps a reference count, kind of like scoped_refptr, but it's a weak reference count. It's not owning. And it keeps this reference count in the memory allocator. So Chrome has its own memory allocator for new and delete called PartitionAlloc. And that lets us do some interesting stuff. And this is one of them. And so what happens is as long as there is a raw_ptr around, this reference count is non-zero. So even if you go and you delete the object, the allocator knows there is some pointer to it. It's still out there. And so it doesn't free it. It holds it. And it poisons the memory, so that just means it's going to write some bit pattern over it, so it's not really useful anymore. It's basically re-initialized the memory. And so later, if you go and use this raw_ptr, you get access to just dead memory. It's there, but it's not useful anymore. You're not going to be able to create security bugs in the same way. Because when we first started talking about a Use-After-Free - you have your goat, you free it, a cow is there, and now your pointer is pointing at the wrong thing - you can't do that because as long as there's this raw_ptr to your goat, the goat can be gone, but nothing else is going to come back here. It's still taken by that poisoned memory until all the raw_ptrs are gone. So that's their job, to protect us from a Use-After-Free being exploitable. It doesn't necessarily crash when you use it incorrectly, you just get to use this bad memory inside of it. If you try to use it as a pointer, then you're using a bad pointer, you're going to probably crash. But it's a little bit different than a WeakPtr, which is going to deterministically crash as soon as you try to use it when it's gone. It's really just a protection or a mitigation against security exploits through Use-After-Free. And then we recently just added raw_ref, which is really the same as raw_ptr, except addressing nullability. So smart pointers in C++ have historically all allowed a null state. That's representative of what native pointers did in C and C++. And so this is kind of just bringing this along in this obvious, historical way. But if you look at other languages that have been able to break with history and make their own choices kind of fresh, we see that they make choices like not having null pointers, not having null smart pointers. And that increases the readability and the understanding of your code greatly. So just like for WeakPtr, how we said, we just check if it's there or not. And if it's not, we return, and so on. It's every time you have a WeakPtr, if you were thinking of a timeline, every time you touch a WeakPtr, your timeline splits. And so you get this exponential timeline of possible states that your software's in. That's really intense. Whereas every time you can not do that, say this can't be null, so instead of WeakPtr, you're using SafeRef. This can't be not here or null, actually - WeakPtr can just be straight up null - this is always present. Then you don't have a split in your timeline, and that makes it a lot easier to understand what your software is doing. And so for raw_ptr, it followed this historical precedent. It lets you have a null value inside of it. And raw_ref is our kind of modern answer to this new take on nullability. And so raw_ref is a reference wrapper, meaning it holds a reference inside of it, conceptually, meaning it just can't be null. That is just basically - it's a pointer, but it can't be null.
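The bookkeeping Dana describes can be pictured with a toy model of a single allocator slot. Everything below is hypothetical: the `Slot` struct, the function names, and the poison byte value are made up for illustration, and the real PartitionAlloc and raw_ptr machinery is far more involved.

```cpp
#include <array>
#include <cstdint>

// Arbitrary poison byte chosen for this sketch.
constexpr std::uint8_t kPoison = 0xEF;

// Toy model of one allocator slot, not real allocator code.
struct Slot {
  std::array<std::uint8_t, 8> bytes{};
  int weak_refs = 0;          // How many raw_ptr-like handles point here.
  bool object_alive = false;  // Whether a live object occupies the slot.
  bool reusable = true;       // Whether the slot may be handed out again.
};

void Allocate(Slot& slot) {
  slot.object_alive = true;
  slot.reusable = false;
  slot.bytes.fill(0x2A);  // Stand-in for the object's data (the "goat").
}

void AddWeakRef(Slot& slot) { ++slot.weak_refs; }

// Deleting the object always poisons the memory, but the slot only becomes
// reusable if no weak references remain.
void Free(Slot& slot) {
  slot.object_alive = false;
  slot.bytes.fill(kPoison);
  slot.reusable = (slot.weak_refs == 0);
}

// When the last weak reference to a dead object goes away, the slot is
// finally released for reuse.
void DropWeakRef(Slot& slot) {
  --slot.weak_refs;
  if (!slot.object_alive && slot.weak_refs == 0) slot.reusable = true;
}
```

In this model a dangling handle can only ever see poison bytes, never a "cow" that was allocated into the goat's old spot.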
33:24 SHARON: So these do sound the most straightforward to use. So basically, if you're not sure - for your class members at least - any time you would use a native pointer or an ampersand, basically you should always just put those in either a raw_ptr or a raw_ref, right?
33:45 DANA: Yeah, that's what our style guide recommends, with one nuance. So because raw_ptr and raw_ref interact with the memory allocator, they have the ability to be like, turned on or off dynamically at runtime. And there's a performance hit on keeping this reference count around. And so at the moment, they are not turned on in the renderer process because it's a really performance-critical place. And the impact of security bugs there is a little less than in the browser process, where you just immediately get access to the whole system. And so we're working on turning it on there. But if you're writing code that's only in the renderer process, then there's no point to use it. And we don't recommend that you use it. But the default rule is yes. Don't use a native pointer, don't use a native reference. As a field to an object, use a raw_ptr, use a raw_ref. Prefer raw_ref - prefer something with less states, always, because you get less branches in your timeline. And then you can make it const if you don't want it to be able to rebind to a new object, if you don't want the pointer to change. Or you can make it mutable if you want it to be able to.
34:58 SHARON: So you did mention that these types are ref counted, but earlier you said that you should avoid ref counting things. So -
35:04 DANA: Yes.
35:11 SHARON: So what's the balance there? Is it because with a scoped_refptr, you're a bit more involved in the ref counting, or is it just, this is we've done it for you, you can use it. This is OK.
35:19 DANA: No, this is a really good question. Thank you for asking that. So there's two kinds of ref counts going on here. I tried to kind of allude to it, but it's great to make it clear. So scoped_refptr is a strong ref count, meaning the ref count owns the object. So the destructor runs, the object is gone and deleted when that ref count goes to 0. raw_ref and raw_ptr are a weak ref count. They could be pointing to something owned in a scoped_refptr even. So they can exist at the same time. You can have both kinds of ref counts going at the same time. A weak ref count, in this case, is holding the memory alive so that it doesn't get re-used. But it's not keeping the object in that memory alive. And so from a programming state point-of-view, the weak refs don't matter. They're helping protect you from security bugs. When things go wrong, when a bug happens, they're helping to make it less impactful. But they don't change your program in a visible way. Whereas, strong references do. That destructor's timing is based on when the ref count goes to 0 for a strong reference. So that's the difference between these two.
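A rough standard-library analogy for the two kinds of ref counts, offered as a sketch and not as how raw_ptr or scoped_refptr are implemented: `std::shared_ptr` plays the strong count and `std::weak_ptr` plays the weak count. With `std::make_shared`, common implementations place the object and its bookkeeping in one allocation, so an outstanding weak reference can keep that memory from being released while having no effect on when the destructor runs. The names `Goat`, `Observation`, and `ObserveRefCounts` are hypothetical.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that records when its destructor runs.
struct Goat {
  explicit Goat(bool* destroyed) : destroyed_(destroyed) {}
  ~Goat() { *destroyed_ = true; }
  bool* destroyed_;
};

struct Observation {
  bool destroyed_while_strong_ref_held;
  bool destroyed_after_strong_ref_dropped;
  bool weak_ref_expired;
};

// Creates one strong and one weak reference, drops the strong one, and
// reports what was observed at each step.
inline Observation ObserveRefCounts() {
  bool destroyed = false;
  Observation result{};

  std::shared_ptr<Goat> strong = std::make_shared<Goat>(&destroyed);
  std::weak_ptr<Goat> weak = strong;  // Does not own the object.
  assert(strong.use_count() == 1);    // The weak reference is not counted.
  result.destroyed_while_strong_ref_held = destroyed;

  // The strong count reaches 0 here, so the destructor runs now, even
  // though a weak reference still exists.
  strong.reset();
  result.destroyed_after_strong_ref_dropped = destroyed;

  // The weak reference can tell that the object is gone.
  result.weak_ref_expired = weak.expired();
  return result;
}
```

The destructor's timing depends only on the strong reference, which matches the distinction drawn above: weak counts do not change the program's visible behavior, strong counts do.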
36:46 SHARON: So when you say don't use ref counting, you mean don't use strong ref counting.
36:46 DANA: I do, yes.
36:51 SHARON: And if you want to learn more about the raw pointer, raw_ptr, raw_ref, that's all part of the MiraclePtr project, and there's a talk about that from BlinkOn. I'll link that below also. So in terms of other base types, there's a new one that's called base::expected. I haven't even really seen this around. So can you tell us a bit more about how we use that, and what that's for?
37:09 DANA: base::expected is a backport from C++23, I want to say. So the proposal for std::expected actually cites a Rust type as inspiration, which is called std::result::Result in Rust. And it's a lot like optional, so it's used for return values. And it's more or less kind of a replacement for exceptions. So Chrome doesn't compile with exceptions enabled even, so we've never relied on exceptions to report errors. But we have to do complicated things, like with optional to return a bool or an enum. And then maybe some value. And so this kind of compresses all that down into a single type, but it's got more state than just an option. So expected gives you two choices. It either returns your value, like optional can, or it returns an error. And so that's the difference between optional and expected. You can give a full error type. And so this is really useful when you want to give more context on what went wrong, or why you're not returning the value. So it makes a lot of sense in stuff like file IO. So you're opening a file, and it can fail for various reasons, like I don't have permission, it doesn't exist, whatever. And so in that case, the way you would express that in a modern way would be to return base::expected of your file handle or file class. And as an error, some enumerator, perhaps, or even an object that has additional state beyond just I couldn't open the file. But maybe a string about why you couldn't open the file or something like this. And so it gives you a way to return a structured error result.
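The file-opening case can be sketched in portable C++17. This is a sketch under stated assumptions, not Chromium code: `std::variant` stands in for base::expected so the example builds without Chromium's base library or a C++23 toolchain, and the "filesystem" is an in-memory map so that nothing touches the disk. All names here (`OpenError`, `FileHandle`, `FakeFile`, `OpenResult`, `OpenFile`, `Describe`) are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>

// Hypothetical error enumerator: the caller learns why the open failed.
enum class OpenError { kNotFound, kPermissionDenied };

// Hypothetical stand-in for a real file handle or file class.
struct FileHandle {
  std::string contents;
};

// Stand-in for "expected of FileHandle, with OpenError as the error":
// exactly one of the two alternatives is held at any time.
using OpenResult = std::variant<FileHandle, OpenError>;

// Hypothetical in-memory file entry used in place of a real filesystem.
struct FakeFile {
  bool readable;
  std::string contents;
};

inline OpenResult OpenFile(const std::map<std::string, FakeFile>& files,
                           const std::string& path) {
  assert(!path.empty());
  auto it = files.find(path);
  if (it == files.end()) return OpenError::kNotFound;
  if (!it->second.readable) return OpenError::kPermissionDenied;
  return FileHandle{it->second.contents};
}

// The caller branches on value versus error, with no exceptions involved.
inline std::string Describe(const OpenResult& result) {
  if (const FileHandle* file = std::get_if<FileHandle>(&result)) {
    return "ok: " + file->contents;
  }
  switch (std::get<OpenError>(result)) {
    case OpenError::kNotFound:
      return "error: not found";
    case OpenError::kPermissionDenied:
      return "error: permission denied";
  }
  return "error: unknown";
}
```

Compared with returning an optional plus a separate bool or enum, the single return type carries either the value or a structured reason for its absence, and the error type could just as well be a struct holding a message string.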
39:05 SHARON: That sounds useful in lots of cases. So all of these types are making up for basically what is lacking in C++, which is memory safety. C++, it does a lot. It's been around for a long time. Most of Chrome is written in it. But there are all these memory issues. And a lot of our security bugs are a result of this. So you are working on bringing Rust to Chromium. Why is that a good next step? Why does that solve these problems we're currently facing?
39:33 DANA: So Rust has some very cool properties to it. Its first property that is really important to this conversation is the way that it handles pointers, which in Rust would be treated pretty much exclusively as references. And what Rust does is it requires you to tell the compiler the relationships between the lifetimes of your references. And the outcome of this additional knowledge to the compiler is memory safety. And so what does that mean? It means that you can't write a Use-After-Free bug in Rust unless you're going into the unsafe part of the language, which is where scariness exists. But you don't need to go there to write a normal program. So we'll ignore it. And so what that means is you can't write the bug. And so that doesn't just mean I also like to believe I can write C++ without a bug. That's not true. But I would love to believe that. But it means that later, when I come back and refactor my code, or someone comes who's never seen this before and fixes some random bug somewhere related to it, they can't introduce a Use-After-Free either. Because if they do, the compiler is like, hey - it's going to outlive it. You can't use it. Sorry. And so there's this whole class of bugs that you never have to debug, you never ship, they never affect users. And so this is a really nice promise, really appealing for a piece of software like Chrome, where our basic purpose is to handle arbitrary and adversarial data. You want to be able to go on some web page, maybe it's hostile, maybe not. You just get a link. You want to be able to click that link and trust that even if it's really hostile and wanting to destroy you, it can't. Chrome is that safety net for you. And so Rust is that kind of safety net for our code, to say no matter how you change it over time, it's got your back. You can't introduce this kind of bug.
42:03 SHARON: So this Rust project sounds really cool. If people want to learn more or get involved - if you're into the whole languages, memory safety kind of thing - where can people go to learn more?
42:09 DANA: So if you're interested in helping out with our Rust experiment, then you can look for us in the Rust channel on Slack. If you're interested in C++ language stuff, you can find us in the CXX channel on Slack as well, as well as on the cxx@chromium.org mailing list. And there is, of course, the rust-dev@chromium.org mailing list if you want to use email to reach us as well.
42:44 SHARON: Thank you very much, Dana. There will be notes from all of this also linked in the description box. And thank you very much for this first episode.
42:52 DANA: Thanks, Sharon. This was fun.