Yan Cui: AppSync, VTL, and Creating Tech Content in 2021
Yan joins Adam to discuss their shared love for AppSync & VTL and his experiences as a thought leader and content creator in the serverless space.
Yan is one of the most respected names in the serverless space and an AWS Serverless Hero.
He is currently helping organizations around the world accelerate product and feature development by adopting serverless technologies and helping them avoid costly mistakes.
Adam: Hey everyone. Welcome to AWS FM, a live audio show with guests from around the AWS community. I'm your host Adam Elmore, and today I'm joined by the Burning Monk Yan Cui. Hi, Yan.
Yan: Hey, Adam. Good to be here. Thank you for inviting me.
Adam: Yeah, thank you so much. I told you before the call, one of the reasons for doing this for me is to talk with folks like you. You've been a hero of mine. You're an AWS hero. I'd love to just hear your journey as it pertains to AWS, or more broadly just in technology starting there with your story.
Yan: Yeah, sure. So, I guess I started my professional development career working for Credit Suisse, which is an investment bank. I was in London at the time just coming out of university. I actually won one of the sponsored awards that Credit Suisse had with my university just before my final year. So, they gave me an internship as part of that, and at the end of internship they offered me a job.
Yan: As a student, you're thinking, "Well, all my friends are looking for jobs in the final year. I've got a job ready for me to go into. It's kind of silly not to take it." And at the time the money was pretty good as well. So, I took the job. But the one thing I realized pretty quickly, after being there for maybe two, three years, is that you're building applications that are used by maybe 20, 30 people inside the bank while you're reading blog posts about things people are doing that are more modern, more cutting edge, using the latest version of the .NET runtime, using the libraries that you're not allowed to use, and building things for hundreds of thousands or even millions of users.
Yan: That's just when Facebook started to take off. So, I realized I have to just get out of that environment for my own development even though I was pretty sure that I'll have to take a pay cut to do that. One of the things that I also found out pretty quickly when I started looking is that you get typecast. You get boxed into this finance environment. Especially in London in the Canary Wharf area there's so many banks.
Yan: You would just go from one door to the next door, literally, you go from Credit Suisse to JP Morgan to some other bank. So, all I was getting was, "Come join this other bank," which I wasn't interested in. I was looking for something that's going to help advance me from a technical point of view. I was lucky that eventually I got a job with this startup that's doing Facebook games.
Yan: That's where I guess I lucked out that really early in my career, it was 2009, I had the opportunity to join a company that was all in on the cloud, all in on AWS. So, straight away, I got exposed to things that really have helped me for the last 10, 15 years, which is amazing. I was really lucky that my manager at the time took a chance on me, because compared to a lot of the candidates he had, I was probably one of the least experienced, and I had no commercial experience at all at that point.
Yan: From then on just pretty much all AWS. In those early days it's all virtual machines. There's no S3. There's no Lambda. There's not a lot of services that you would come to rely on nowadays. We were using SimpleDB, which I don't know if anyone who's listening knows still exists. You can still get access to it. If your account had it before, then you still have it. Otherwise, you have to send in a support ticket to just enable it in your account.
Yan: But, yeah, no DynamoDB. I think SQS was there. S3 was there. SNS, I think, wasn't even there at the time. So, that's quite bare-bones compared to what you have nowadays on AWS. So, from there, I worked in the games industry for quite a number of years, working on Facebook games and then mobile games when everyone just left Facebook to play games on their mobile phones. So, we kind of just followed where the audience was.
Yan: So, from then on... I don't know if you have seen this before where people had a job for a very long time and then once they leave their job, they start to hop every year or 18 months. I went through that phase. I left this job, where I'd been for six, seven years, and then suddenly I found myself jumping between different industries.
Yan: I was in social networking for a bit. I was in sports streaming for a bit. I was in e-commerce for a few months with Just Eat. I don't know. I just went through this phase where you go to a place. You experience a new industry, but then you get to the point where you want to try something else again. So, I spent a couple of years, went through quite a few different industries and learned quite a bit about how different industries work.
Yan: Some of the interesting challenges: I was at DAZN, which is the biggest sports streaming platform. And that's where you learn about some of the really high scale problems and also the challenges of doing live streaming. Because even though we weren't dealing with the actual media streaming itself, you still have to build systems that can sustain the really spiky traffic, because Man United is playing Liverpool, for example, big match, watched by over a million people concurrently.
Yan: Everyone comes in just before the match. Literally, within 30 seconds we go from almost no traffic to tens of thousands of requests per second. So, you have to build your system in a certain way, and you have to think about those kind of things. You have to think about uptime quite extensively. Unlike catalog content that you can download on demand, you can't just cache anything. There's nothing to cache. Everything is coming in live. So, if you've got any kind of outage, then no one can watch the match.
Yan: In fact, early on at DAZN they had an issue with an outage during a big J League match. And there was a pretty significant impact, because when that happened, no one could watch the match. And they almost lost the license for the J League, which they spent something like $2 billion to acquire. So, there's a lot of emphasis on uptime, resilience, and scalability. So, that was a really interesting learning experience in terms of how to build systems that can support that kind of workload, that kind of spiky traffic.
Yan: And, of course, because of that, even though we're all big fans of serverless at DAZN, you can't use Lambda in a lot of the workloads that need to basically deal with that spiky traffic. Because Lambda, even though it's scalable, it's scalable up to a certain point, and once you go beyond that point, then it becomes quite challenging to meet those scalability demands you reach with Lambda.
Yan: So, there's very much a mixed environment with lots of containers, also with Lambda as much as possible for things that are not on that user-facing path where you get that million people joining at the same time. Or you can cache things, cache your API responses. DAZN was a really interesting learning experience. Since then I've gone the similar path to you. I've gone independent as a consultant, which I'm actually enjoying a lot as well.
Yan: The variety of work, the variety of problems and domains people are working in, and you trying to step into their issues and help them, has been really enlightening. You know how I talked about going through that phase of jumping between different industries? It's almost like that, but on steroids. Now every client you're working with is probably coming from a different industry.
Yan: And they have different contexts and constraints that you have to work with, and you have to apply what you know and try to help them the best way you can. So, I find that has been really, really interesting. You learn a lot from the clients. Even though you are helping them, you are lending them your expertise, you also learn a lot in the process as well.
Adam: Yeah. Oh, absolutely. That's been one of my favorite parts of... I mean, there's the freedom, and that's great in terms of being independent. But just being able to select clients that you're going to learn things, you're going to branch into things that you really had never been exposed to, whether that's the problem space or you see this particular problem they're dealing with.
Adam: And there's a certain solution maybe in mind that you're branching into technologies you don't have as much exposure to. I love the sort of everything is different every week to week or month to month, taking on shorter term projects, just a whole lot of variety. It's really great.
Yan: And I guess one of the things that's probably also worth mentioning as well is that, certainly in Europe and the UK, there's a ceiling in terms of your earning potentials if you're taking a full-time job or even taking a contract job. It's not like in US where you could be looking at anything from $300,000-$400,000 to maybe $1 million a year.
Yan: Those kinds of salaries don't really exist in Europe and the UK. Contractors typically earn quite a bit more, but I guess as an independent, you have a lot more options available to you. And my earning potential nowadays is so much higher compared to my last full-time job, even though that was a really well-paid job by Europe and UK standards. And also, like I said, I get a lot more freedom to choose what projects I take on and also how much I want to work as well.
Adam: Yeah, absolutely. One of the first tweets I think I saw from you and how you came on my radar, maybe it was a blog post too, but you talked about building out a pretty complicated application for a client using AppSync. And, for me, I had been building with AppSync for a while, and it felt like I didn't see a lot of that.
Adam: I didn't see a lot of people talking about building modern applications with AppSync and VTL, which I want to talk about as well. Do you see that lack of content out there, or is it actually lack of adoption? Is AppSync under the radar, or is it I'm just not looking at the right places?
Yan: Actually, I don't have a lot of data on the AppSync adoption, but from what I do know, the people that I do know, it's quite popular. Maybe there's a lot of adoption coming from not your traditional AWS-centric or backend-centric teams. There's a lot of adoption from more frontend focused teams, teams that are looking more from a full-stack point of view. So, maybe they're not in your normal AWS circle. Certainly, I know a lot of people use AppSync through Amplify.
Yan: And I guess Amplify and AppSync get considered as one thing, because your entry point is Amplify, and you're working with Amplify even though Amplify behind the scenes is creating AppSync APIs for you. So, maybe there's a lot of that adoption. It's just not on our radar, because we're dealing with AWS people. We're dealing with people that are building backend systems with AWS. So, we're not really looking at the people that are writing things about how to do things with Amplify even though they are really working with AppSync in that case.
Yan: But, yeah, I do think AppSync adoption is maybe not as high as something like API Gateway, which I still see as much more common. But then again, I think if you're talking about REST versus GraphQL, this is a paradigm shift. It does take more effort, and it's going to take longer to become mainstream. But I have to say AppSync has been amazing. The fact that I can do so much now. As a one-man team, I've taken on projects that really would have needed a whole team of people to deliver, because with AppSync, I can just do so much.
Yan: If you think about writing a REST endpoint with Lambda or even with containers, it's not hard. You write a bit of handler code that handles some user requests. You've got some middlewares for dealing with cross-cutting concerns, all of that. We all know how to do that. But every endpoint you're dealing with, even if you're just reading something from a database and returning it as it is in the right format in JSON, requires you to write some custom code.
Yan: It requires you to configure your Lambda functions, requires you to configure various different things, IAM roles and all of that, which you can just delegate to AppSync and just say, "Hey, here's how to make a request, how to translate the request arguments from the request body to the request you're going to send to DynamoDB, for example."
Yan: It just bypasses all of your custom code and Lambda and whatnot. It's more performant. It's more scalable. It's just easier. You can get so much done. I've had cases where I'm building out an entire new feature in just a matter of maybe not even hours, minutes, because it's just a couple of CRUD endpoints.
Yan: Copy something that I did before. Look at the structure for the AppSync, the VTL template for DynamoDB. Now [inaudible 00:14:27] in the VTL file. Change the parameters where it's coming from, from the arguments, and that's it. I'm done. Within 30 seconds I've got a new endpoint, and then I do the same thing again for the next thing. So, it just takes no time to actually implement.
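To make the "copy a template and change where the parameters come from" workflow concrete, here is a minimal Python sketch of the JSON request document that an AppSync DynamoDB `GetItem` request template renders to. It is not VTL and not Yan's actual template; the helper `get_item_request`, the key names, and the argument names are hypothetical, chosen only for illustration.

```python
import json


def get_item_request(key_name: str, args: dict, arg_name: str) -> dict:
    """Build the request document a DynamoDB GetItem template would render to.

    key_name is the table's partition key attribute (hypothetical here) and
    arg_name is the GraphQL argument that supplies its value.
    """
    return {
        "version": "2017-02-28",
        "operation": "GetItem",
        # DynamoDB typed attribute value: {"S": ...} marks a string.
        "key": {key_name: {"S": str(args[arg_name])}},
    }


# Two "endpoints" that differ only in where the key value comes from.
get_user = get_item_request("userId", {"userId": "u-123"}, "userId")
get_order = get_item_request("orderId", {"id": "o-456"}, "id")

print(json.dumps(get_user, indent=2))
```

The second request reuses the same shape with a different key and argument, which is the whole of the "change the parameters" step described above.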
Yan: You end up spending a lot more time thinking about, "Okay, are we doing the right thing in terms of the DynamoDB structure? Are we thinking about the fact that we may have a different access pattern later? Should I be using one table for each entity? Should I consolidate several entities into one table?" For those out there, I'm not a big fan of single table design, at least not as a default, as a principle. I do use the techniques where it makes sense, but I'm not dogmatic in terms of, "Okay, everything has to be in the same table."
Yan: So, my default is, "Have as many tables as you need." But where it makes sense because of your access patterns, then sure, put multiple things into one table and apply the techniques that people like Alex DeBrie and Rick Houlihan have been writing about and educating people on. I just don't think that you should be doing that all the time, just because.
Yan: Everything we do has a trade-off. And as much as single table design is going to give you some performance improvements and cost improvements, they only matter when you're at a certain scale. Otherwise, you're just paying the complexity cost without getting any of the benefit back. And there are some things that don't work as well. For example, DynamoDB Streams becomes almost useless when you just get events for everything. You have to filter this in your own code, and then you've got a limit of five subscribers per stream.
Yan: Well, that's going to be a problem once you've got a more interesting system that has to react to lots of different events. So, you have to do your own fan-out. One Lambda function to process events in the stream and then fan out to other places so the other functions can [inaudible 00:16:30] of more than five subscribers. Things like that just become really difficult to deal with when you've got single table design, which I think, again, is a complexity cost that would make sense if you're dealing with really high throughput like Amazon.
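The fan-out described here can be sketched in a few lines of Python. This is a hedged illustration, not a production handler: it assumes a single-table layout whose partition key attribute is named `PK` with an entity prefix such as `USER#`, and the `publish` callback stands in for whatever downstream target (a queue, a topic, an event bus) the real system would use. All of those names are hypothetical.

```python
from collections import defaultdict


def entity_type(record: dict) -> str:
    """Derive the entity type from the key prefix, e.g. 'USER#1' -> 'USER'."""
    return record["dynamodb"]["Keys"]["PK"]["S"].split("#", 1)[0]


def fan_out(records: list, publish) -> dict:
    """Route each stream record to a per-entity destination and count them."""
    counts = defaultdict(int)
    for record in records:
        kind = entity_type(record)
        publish(kind, record)  # one stream consumer, many downstream targets
        counts[kind] += 1
    return dict(counts)


# A tiny batch shaped like DynamoDB Streams records (only the fields used here).
batch = [
    {"eventName": "INSERT", "dynamodb": {"Keys": {"PK": {"S": "USER#1"}}}},
    {"eventName": "MODIFY", "dynamodb": {"Keys": {"PK": {"S": "ORDER#9"}}}},
    {"eventName": "REMOVE", "dynamodb": {"Keys": {"PK": {"S": "USER#2"}}}},
]

sent = []
summary = fan_out(batch, lambda kind, rec: sent.append((kind, rec["eventName"])))
print(summary)
```

With one table per entity, each stream already carries a single kind of event, so this routing layer is the extra code that single table design asks you to own.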
Yan: Rick Houlihan talked about all the migration projects they've done. I mean, you're talking about people that are used to hitting DynamoDB millions of times per second. So, even the tiniest bit of optimization is going to mean a lot of money. But that doesn't apply to 99.9% of people out there. So, as much as I love the work that Alex and Rick are doing, I do wish there was more contextual information that goes along with their advice around single table design.
Yan: There's a strong correlation, not causation, a strong correlation between people who tell me that VTL sucks, or that they hate it, and people who are using single table design, because you do [crosstalk 00:17:28] more custom code in VTL when using single table design. VTL is not the most popular language out there, even though once you get used to it, it-
Adam: Developer friendly.
Yan: Yeah, exactly. So, a lot of people struggle with VTL, especially when they have to write more custom logic, because single table design requires you to do that. If you are making one query to get multiple kinds of data, you have to slice the data out in the response template. Things like that just add complexity. And when I have to pay extra, I always want to know, "Am I getting the right amount in return?" In this case, I don't think most people would be. But again, I guess that's another discussion for another time, maybe with Alex on this show.
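The "slicing" step is easier to picture with data in hand. The Python sketch below shows the kind of logic a response template ends up carrying when one query returns several entity types. It assumes a hypothetical sort-key convention (`PROFILE` for the customer row, `ORDER#...` for orders); real tables will differ, and in AppSync this logic would be written in the response mapping template rather than in Python.

```python
def slice_items(items: list) -> dict:
    """Split the mixed result of one single-table query into typed parts."""
    result = {"customer": None, "orders": []}
    for item in items:
        sort_key = item["SK"]
        if sort_key == "PROFILE":
            result["customer"] = item
        elif sort_key.startswith("ORDER#"):
            result["orders"].append(item)
        # Anything else is ignored in this sketch.
    return result


# One query against PK = 'CUSTOMER#42' returns a mix of entity types.
query_result = [
    {"PK": "CUSTOMER#42", "SK": "ORDER#2021-01-03", "total": 30},
    {"PK": "CUSTOMER#42", "SK": "PROFILE", "name": "Ada"},
    {"PK": "CUSTOMER#42", "SK": "ORDER#2021-02-11", "total": 12},
]

sliced = slice_items(query_result)
print(sliced["customer"]["name"], len(sliced["orders"]))
```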
Adam: Yeah, that'd be fun. I've seen your comments on Twitter. So, I kind of knew your stance on single table. And I think it's the same stance I was adopting, I think, out of just ignorance. I didn't know better. But I felt that, dealing with AppSync, single table wasn't native to that experience. GraphQL has some inherent properties in how data is retrieved that make it more difficult, I feel, going down the single table route.
Adam: There's this thing in the back of my mind that, "Oh, single table is best." That's what you feel if you're part of the community that you should be doing. And I loved seeing your messaging on the subject, making me feel better about, "I don't have to do it every time, or it's maybe not even the best default for what I'm doing full-stack development with AppSync."
Yan: Yeah. So, one of the problems that single table design solves is data stitching. So, you can get all the data you need in one query and then stitch them together yourself, making multiple requests and all of that a lot easier. The thing with GraphQL and AppSync is that it does that for you. You just associate the entity with the right resolver at the right table, and then it does all of that for you.
Yan: And it parallelizes things as much as possible, at least as much as it's able to. So, depending on how deep your nesting goes in your schema, you don't have to worry about, "Okay, make those requests in parallel, and then stitch things together," because AppSync just does that for you. I spoke with Alex about this a while back on my podcast, and he actually said, "Oh, yeah, with AppSync, single table design is probably not the best for you." So, I think even he agrees with that point.
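For contrast, this is roughly the stitching code you would otherwise write by hand behind a REST endpoint: fetch the parent, then fetch its nested fields in parallel and merge them. It is a minimal Python sketch with stand-in data sources; the `fetch_*` functions and their return values are hypothetical, and it only illustrates the pattern, not how AppSync is implemented internally.

```python
import asyncio


async def fetch_user(user_id: str) -> dict:
    await asyncio.sleep(0.01)  # stands in for a table read
    return {"id": user_id, "name": "Ada"}


async def fetch_orders(user_id: str) -> list:
    await asyncio.sleep(0.01)
    return [{"orderId": "o-1", "userId": user_id}]


async def fetch_friends(user_id: str) -> list:
    await asyncio.sleep(0.01)
    return [{"id": "u-2"}]


async def get_user(user_id: str) -> dict:
    """Resolve the parent first, then sibling nested fields concurrently."""
    user = await fetch_user(user_id)
    orders, friends = await asyncio.gather(
        fetch_orders(user["id"]),
        fetch_friends(user["id"]),
    )
    return {**user, "orders": orders, "friends": friends}


result = asyncio.run(get_user("u-1"))
print(result)
```

With one resolver attached per nested field, none of this orchestration lives in your own code.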
Adam: Going back to something you had said. You mentioned Amplify, and I never really thought about my perspective as a full-stack coming from a full-stack web engineering background. That's probably why I preferred AppSync in the first place and why, for my use cases, I'm building out full applications from a frontend perspective. I don't think of myself as a frontend developer, but I think that's where I'm working.
Adam: And I hadn't really put it together that Amplify does drive the narrative for AppSync. It just abstracts away a lot of AppSync. The data modeling in Amplify hides a lot of what I think is so great about working with AppSync. And I guess that's maybe why there's not so much content. I feel like AppSync-specific content, you've got a lot of stuff out there, and I just haven't seen a whole lot else.
Adam: So, I appreciate all the attention you bring there. One of the things you talk about is VTL. And I'd love to nerd out on VTL, because that's another thing that I feel like there's got to be people out there writing VTL, and I just don't know where they hang out. I don't know if you do. Do you know communities? Is there a discord channel or something that I can join?
Yan: I don't know either. I know a couple of people who love VTL, but the vast majority of people out there just don't like VTL. I think VTL-
Adam: Could you explain... Oh, go ahead.
Yan: Sure, I'll go ahead. So, VTL is this templating language based on Java, and it's been used in API Gateway as well. So, from working with API Gateway for many years, I'm familiar with VTL already. But if you actually go to VTL's homepage, they have got quite a good reference page for the basic syntax, the structure of VTL. And a lot of the methods you find on objects are the same as in Java.
Yan: So, a lot of the object methods are probably not something that people are familiar with, and the syntax as well for conditional branching, the hash in front of the If and End, Else. That's just something that you have to get used to. It's one of those things that, at the end of the day, it's just a syntax. It doesn't really matter.
Adam: Yeah, because there are benefits. Could you speak to why choose to go through the pain of learning VTL and writing VTL templates versus just a Lambda resolver?
Yan: Yeah. So, from the performance side of things, if you compare VTL to using a Lambda resolver, VTL is just executed by AppSync. There's no additional thing that you need to call. So, you don't have to wait for a Lambda function to cold start, in the case that you do have a cold start, which can add quite a significant amount to your response time. And you don't have to wait for any additional overhead that you have with Lambda.
Yan: For example, if you're doing some other things in your Lambda function to send telemetry data to your own monitoring systems and things like that, which you shouldn't do that in the synchronous path, but if you're doing that as well, then that's going to get added to your response time. So, VTL is just going to be the fastest way to get data into your response.
Yan: And in terms of scalability as well, anytime you invoke a Lambda function, you are consuming Lambda's concurrent executions, which has a soft limit per region. I think for most regions it's about 3,000. You can raise that. That's not a problem. The problem comes when you're dealing with a really spiky workload where you've got a sudden spike in traffic. You could hit another limit on Lambda scalability, which is a hard limit, which is, once you hit the initial threshold, I think for most regions it's about 3,000.
Yan: Then you can only increase your concurrent executions by up to 500 per minute. And that's a hard limit even though you could arguably negotiate with AWS on a case-by-case basis that, "I have got this really specific workload. Can you raise that 500 to a 1,000," or whatever. But, ultimately, you're still dealing with some sort of hard limit that you're going to hit in terms of how quickly you can get to your peak concurrency.
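A little arithmetic shows why this matters for spiky traffic. The Python sketch below uses only the figures quoted in the conversation (an initial threshold of about 3,000 concurrent executions, then up to 500 more per minute); these are the numbers as stated in 2021 and are treated here as assumptions, since actual limits vary by region and have changed over time. The function name is hypothetical.

```python
import math


def minutes_to_reach(peak: int, initial_burst: int = 3000, per_minute: int = 500) -> int:
    """Minutes of ramp-up needed before `peak` concurrent executions are available."""
    if peak <= initial_burst:
        return 0  # the initial burst already covers the peak
    return math.ceil((peak - initial_burst) / per_minute)


for peak in (2_500, 3_001, 10_000):
    print(peak, "concurrent executions ->", minutes_to_reach(peak), "minutes")
```

Under these assumed figures, a spike that needs 10,000 concurrent executions takes 14 minutes to ramp to, which is far too slow when the traffic arrives within 30 seconds of kick-off.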
Yan: So, when you're dealing with Lambda functions, you're dealing with that. So, if you're not using Lambda for your request, just using VTL and letting AppSync call the backend resources directly, then you're not dealing with that anymore. So, that's good for scalability and performance, but also in terms of cost.
Yan: If you're having Lambda functions that have to run to call DynamoDB, and let's be honest, your function is just going to be waiting for the response from DynamoDB, be it 5 milliseconds, 10 milliseconds, or maybe 50, then you're paying for all of that time for the Lambda invocation just for CPU to sit idle even though you're paying for every millisecond of it. So, again, there's a cost benefit as well with not using Lambda resolvers.
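The cost point can be made concrete with a back-of-the-envelope calculation. In the Python sketch below, every number in the example call is an illustrative assumption rather than a quoted price: the per-GB-second and per-million-request rates are passed in as parameters and should be replaced with current figures for your region. The function name is hypothetical.

```python
def idle_wait_cost(
    requests: int,
    wait_ms: float,
    memory_mb: int,
    price_per_gb_second: float,
    price_per_million_requests: float,
) -> float:
    """Estimate what is paid for functions that mostly wait on a downstream call."""
    gb_seconds = requests * (wait_ms / 1000) * (memory_mb / 1024)
    compute = gb_seconds * price_per_gb_second
    invocations = requests / 1_000_000 * price_per_million_requests
    return compute + invocations


# Illustrative inputs only: 100M requests, 10 ms spent waiting, 1 GB functions.
cost = idle_wait_cost(
    requests=100_000_000,
    wait_ms=10,
    memory_mb=1024,
    price_per_gb_second=0.0000166667,  # assumed rate, check current pricing
    price_per_million_requests=0.20,   # assumed rate, check current pricing
)
print(round(cost, 2))
```

Whatever the actual rates are, the structure of the bill is the same: the compute term scales with time spent waiting, and that term disappears when AppSync calls the data source directly.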
Yan: Of course, there are cases where you do have to use Lambda resolvers, or you prefer to use a Lambda resolver, because you've got validation logic that you may find a lot easier to write in Node.js or Python or whatever compared to VTL. Then sure, there's a benefit there in terms of dealing with some business domain complexity. And I guess you introduce a bit more operational complexity in terms of having a Lambda function in there, but there's that return on your investment.
Yan: Otherwise, VTL is going to give you much better performance, cost, and scalability. One thing that you do get with Lambda though is that, if you're using other tooling to monitor your stack, then you get a bit more visibility, because the Lambda function can be wrapped to emit custom metrics and other telemetry information, whereas if you're using VTL, then you're relying on AppSync's built-in logging to log the information you need to debug things.
Yan: Once you enable full resolver logging in AppSync, it gives you so much stuff, which is great for debugging. But at the same time it's a bit of a cost concern in production, when CloudWatch Logs is really expensive. And if you've got an AppSync resolver that returns an array and then you hydrate that array with more nested resolvers, then that logging is going to get crazy.
Adam: Oh, yeah. I've been surprised by a CloudWatch bill before because of that.
Yan: Yeah. So, in production you should turn it off. Right now AppSync doesn't offer any sort of sampling for their logging. So, I end up doing a custom workaround where I'm running a cron job that changes the log level for the AppSync API for a certain number of minutes per hour so that I get some sampling of those detailed debug log messages. But just don't leave it on 24/7, because the cost of doing that can be excruciating. You could pay many times more for CloudWatch and CloudWatch Logs than you pay for AppSync.
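The scheduling half of that workaround is simple enough to sketch. The Python below only decides which field log level should be active at a given time; the five-minute window and the function name are hypothetical, and actually applying the level (through AppSync's update API, in the log configuration's field log level setting) is deliberately left out because it depends on how the API is deployed.

```python
from datetime import datetime, timezone


def desired_field_log_level(now: datetime, sample_minutes: int = 5) -> str:
    """Return 'ALL' during the first `sample_minutes` of each hour, else 'ERROR'."""
    return "ALL" if now.minute < sample_minutes else "ERROR"


# A scheduled job could call this every few minutes and update the API
# only when the desired level differs from the current one.
now = datetime.now(timezone.utc)
print(now.isoformat(), "->", desired_field_log_level(now))
```

A five-minute window per hour samples roughly 8% of traffic at full detail, which keeps some debug logs flowing without paying for them around the clock.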
Adam: Yeah. So, there's a lot of benefits from a runtime perspective with AppSync. Some of the downsides, I've heard people complain, "The testing picture is not great." But I think, overall, the developer experience with VTL is lacking. And I laughed hearing you explain the process of building out a new feature in AppSync. It's exactly my workflow, which is copy bits of VTL that worked before and tweak them to work in some new feature. Is that something that you think is prohibitive? Is the developer experience for VTL just holding back people from adopting it?
Yan: A little bit, yes. But I also think that there's this nice little feature in the AppSync console: if you're editing the VTL request and response templates in the AppSync console, it gives you IntelliSense over VTL, which is something that people don't realize is there. They probably need to surface it a little bit better than, "Oh, once you go to the schema and then you go to the right resolvers and then you start editing, then you see the VTL editor."
Yan: Maybe that needs to be taken out into a VTL Playground or something like that so that you can get that nice IntelliSense, "You want to write If? Great. Here's the If syntax." And then it shows you the IntelliSense support for the [inaudible 00:29:25] and context objects as well so that you can discover what's there.
Yan: So, it gives you that better developer experience only when you're in the AppSync console and when you're editing a specific VTL template. So, if that exists on its own and somewhere else, then I think it would be a lot better developer experience. Or maybe someone just needs to write a VTL plugin for Visual Studio. Maybe that's what [crosstalk 00:29:53] did really.
Adam: It's funny you say that. Yesterday someone had DMed me just about seeing our episode and where we were going to talk about VTL. And they DMed me and said, "Hey, I want to show you something I've been working on." And I got on a call with this person yesterday and saw this VTL Playground where it's exactly that. You paste a body in a context object and you write your template against it and execute in real time. It was really slick.
Adam: So, I think he was focused on API Gateway first, and I don't know when he's launching this, but he showed me in the console where AppSync has this light IntelliSense built into the console there. That was the first I'd seen it. That was yesterday. I've written so many VTL templates painstakingly and had no idea that existed.
Adam: So, I do think that's an area we'll see continued improvement, that people will build on that experience. And I'm excited to see stuff like that launch, because I'm doing stuff in string literals, in CDK constructs. A lot of times I'm writing VTL templates in a very painful way.
Yan: Yeah. I've found that sometimes when I need to write more complex VTL code that I'm not 100% sure about, I just go to the console and start editing there. You can save right in the console as well and then test things out against your AppSync API. So, I find that quite useful just to experiment with things, especially when I'm writing code I'm not 100% sure about.
Yan: But once I've got the code, once I'm sure that, "Okay, this should work, at least mostly," then I can start to write my unit tests for those VTL templates, which I actually talked about in my AppSync Masterclass course, where you can use some of the libraries that they include as part of Amplify, but which they also make open source and available via NPM. You can use those to simulate AppSync's VTL engine so that you can say, "Hey, here's my template. Here's the argument for the context object and the [inaudible 00:31:53]."
Yan: And then just run it and generate the output request that you're going to get, which is a JSON structure you can check. Does it have the right argument? Is it doing the right thing? You can do that once you've got a solid foundation for your VTL code. If you're writing anything complicated, then you can just capture that and test it in your unit test so that you have some protection against regression going forward. But, yeah, I do use that. I do like the AppSync's VTL editor when I need to experiment with things quickly.
Yan: I think that's still happening. It just hasn't happened yet. They put out the RFC for, I think, late last year or early this year. I can't remember when they did that. But it's something that a lot of people have been asking for. Again, it's going to help lower the barrier of entry for using VTL. Or let's maybe not call it VTL. Let's say letting AppSync talk to your databases directly as opposed to a Lambda function, which I think is the direction that they want to push people towards anyway.
Yan: So, that's usually not a lot of custom code unless, of course, you're dealing with either complexity in the business domain or complexity in how you organize a database. If you're using single table designs and you're fetching multiple entities with one request, then you have to do a bit more data slicing afterwards and in the response template. But even with all that, I think AppSync is still just so much more productive, for me at least, compared to other things I've used in the past, including API Gateway.
Adam: So, you mentioned the AppSync Masterclass. For those that have joined maybe late, we are giving away a premium license. So, just stick around. We'll be doing that here at the end of the show. And I want to talk to you a bit about content creation, because looking at your body of work, and you're doing everything... I mean, you've got workshops for serverless. You've got the AppSync course. You've got, I think, other courses, even a Lambda course maybe. You do open source work. You blog. You have a podcast. You have a YouTube channel. Is that everything?
Yan: I think that's right.
Adam: Yeah. So, how do you manage all of that? Or do you feel like you are primarily a content creator and you do other things to sustain? Or is it just the thing you do, because you enjoy it?
Yan: Just the thing I do, because I enjoy it. People often ask me, "How do you find time to do all that?" I think the fact that you enjoy something, you're just going to make time for it, and it takes time to do a lot of those things. But at the same time, they kind of reinforce each other.
Yan: Because a lot of things that I'm writing and sharing as a blog post after a while, I can take a lot of the same ideas and put it into a coherent course that I can show, I can teach people, I can sell, and then I can maybe even put them together in some structured format like a workshop so that I can teach people as well.
Yan: Because not everyone has time to read the hundred different blog posts I've written in the past, and also not all of them are up-to-date based on all the changes that we see in AWS. So, having some other things that I can then just continuously update and refresh based on new features and new capabilities. And then I can also monetize as well. That really helps.
Yan: The fact that you can monetize a lot of the work that you're doing, that also helps with the motivation. But I think, first and foremost, I just find it good for myself that when I learn something, to be able to write it in a way that explains to other people who may not have gone through the same journey as me. Firstly, why this is important, why they should even think about this, and read about your solution. And then also share the solution with people so that they can then take that and use in their own day-to-day work.
Yan: The fact that you have to go through that process of formulating things and consolidating them and making them coherent and building a coherent story around it so you can share with other people, it helps you learn better as well. It's amazing how many times I thought I understood something until I started to sit down and start writing and realize, "Wait a minute. How do you get from point A to point B?" And then realize there's something in the middle that you are not 100% sure about. Then you have to go back and then figure out, "Okay, right. That's how it actually works." It helps you build up a better mental model or more complete mental model of how things actually work.
Yan: Another thing that I find really useful is that through a lot of these things I'm doing a lot of experiments, testing scenarios that are outside of the documentation. It's amazing how many times I've run into situations that are not covered by AWS documentation, which is great by the way. But there's a lot of failure paths, the specific edge cases that, once you start sitting down and thinking about this stuff holistically, you think, "That's how it works with SQS and Lambda. Great. But what will happen if this happens?"
Yan: And then you come up with some specific scenario which is interesting to you maybe from a more academic point of view, because maybe no one runs into this. But maybe someone runs into it, and then they'll be stuck, because no one has written about this before. But anyway, it's just interesting for me to do a lot of experiments, plan them, execute them, and then share the results afterwards.
Yan: And I think that really helps me learn myself and also build up that more complete mental model of how things actually work. And also, you appreciate some of the challenges that AWS has to deal with with all these different systems, how to integrate them together seamlessly and smoothly and deal with all these different edge cases.
Adam: You mentioned writing first. Would you say you prefer starting that way? You write your thoughts, and then it turns into other types of content?
Yan: Yeah, I would say so. I prefer writing. I think writing is probably what I enjoy the most in terms of content creation. I love doing video courses and all that and love doing talks as well, but the writing is where you can really get into a lot of the nitty-gritty details, especially on a more technical topic, because you have to share lots of code, and people will actually sit down, and, hopefully, they will read a lot of it, whereas with talks, it's hard to do that.
Yan: And also with video courses, it's just video editing is a pain. It's great. Again, you can do a very deep dive on the technical topics. You can show a lot of code. You can do very detailed analysis on the piece of code, how you're doing it and how things work, but it just takes a lot more effort, whereas with writing, it's a lot simpler.
Yan: Also, with writing, one of the things I find very interesting is that a lot of people have commented on my writing style, and they said good things about it. And I think one of the things that I found really interesting... I've been writing for more than 10 years now... is that there's a few basic principles that I tend to follow. It's like sports. There's a few basic things that, if you can do them well, you're pretty good already, but it's incredibly hard to do those basic things well consistently.
Yan: And I think it's the same with writing. I see a lot of people start writing, and I've helped mentor a few people. And when they start wanting to write technical content, you see some common mistakes like, "Oh, I've got this thing I really want to share with people," and then they go straight into the solution. I don't know if you remember that scene from The Wolf of Wall Street where DiCaprio said to the guy that plays the Punisher, "Sell me this pen."
Yan: He doesn't start telling you about the pen, how great it is as a solution. He starts telling you, "Okay, I want to write you a check." So, he gives you a problem that demands the solution. And then says, "Okay, we don't have a pen. Great. How much do you want to pay for this pen?" Which I think is a good analogy for how you probably should be approaching writing technical content.
Yan: Don't sell the solution until you've sold the problem to the reader, because there's a good chance that whoever is reading your article hasn't seen your specific problem, but they're interested, maybe just out of curiosity. So, you have to build a picture of why it is a problem that's worth solving before you present the solution.
Yan: I see a lot of people that start writing. They just go straight to the solution with no context, with no background on what is the problem that you're trying to solve, and also importantly, what are the constraints that you're dealing with? Because you can't really present a solution without also talking about those two things.
Yan: People talk about best practices, but then best practices a lot of times is just someone else's opinion that works for someone in that particular context. So, unless you also mention the context, then you can't really talk about what is the best practice. And that's one of the things that you really appreciate as a consultant where one solution may work great for maybe most of your customers, but then for some customers, they have got specific constraints they're dealing with. That solution just doesn't make sense.
Yan: Maybe they're running at much higher scale so that they do need to use a single table design to get the cost benefit and performance benefits from DynamoDB, but maybe for 99% of your customers, you don't need to do that. So, again, that context, that constraint is very important in solutioning a problem, but it's also important for writing the article that explains what it is that you're doing as well.
Yan: And that's one of the things I keep finding when I'm reading articles. I read through the solution and understand that solution, but I don't understand your problem, and I don't understand why you're not doing some other thing that just seems to me to be a lot simpler, because the constraint has not been explicitly explained in the article.
Yan: So, that's one of the things that I always try to remind myself: "Okay, before you start going to the solution, because that's the exciting thing you want to write about, talk about the problem first, and then explain the constraint you're dealing with, which is going to help lead to a better understanding of why this solution makes sense for this problem and in this situation that you're dealing with. Because you can skin a cat a hundred different ways. There's no one right way. It depends on the conditions that you're dealing with."
Yan: Another thing that took me a long time to appreciate: don't use words like "just" or "simply", because what seems simple or trivial to you may not be simple or trivial to somebody else who has got a very different set of skills and background. I had to spend quite a lot of mental effort and cognitive effort to get that wording out of my writing so that it reads less condescending almost. I don't know if condescending is the right word, but, yeah, it's something that I've had to learn over time.
Yan: And it affects my writing style, and I tend to read through what I write a couple of times to just remove as many of the unnecessary words and padding as I can. There are articles where you are writing something as an inspiration, where you want to encourage more people to adopt a different technique or think about a different problem, and then you maybe use a different kind of wording. But for a lot of these more technical, "Hey, here's how you do something," posts, I tend to just remove as much of the padding words as I can.
Yan: There's a tool called Hemingway which I use to check the reading level of my writing. So, I try to make it level eight or nine so that it's as accessible as possible, because maybe a lot of people reading your posts are not native English speakers. So, again, you want to make that reading more accessible and go with simple sentence structures and use simple words as much as possible. Don't use words like, "I'll simply do this. Just do that." Because maybe for you it's "just do that," but then for someone who doesn't understand how some of the other tools work, that actually means something else. They have to then go and learn and read and appreciate.
Adam: Yeah. So, those are some excellent tips in terms of writing, and I know it's something I want to do more. I guess, from your perspective, you've been putting out content a long time. Is there something you would say to an aspiring content creator in 2021, something you've learned over these 10 years of writing or creating other content?
Yan: Yeah. I guess, probably, the hardest thing to do as a content creator is just to do it consistently. It's easy to go through a blitz and just write something every day, but then you really quickly run out of steam. You want to do that consistently. And also, just offer consistent quality as well. Well, actually, you want to improve quality over time, but also have a consistent standard for the quality of the content you push out.
Yan: And one thing I also want to push, I want to talk about, is to use this as an opportunity for you to learn something. Also, there's no problem with sharing things that other people shared already, and that's absolutely fine. There's always room for more of a personal take or opinion on a certain subject. But I find there's a lot of value for yourself in terms of your own development to do experiments and to figure out how something actually works.
Yan: And I don't know if you're familiar with the Dreyfus model for skills acquisition. It talks about how to go from a novice to competent to eventually become an expert or master. And if you remember from Inception, another DiCaprio film, he always talks about how you don't want to go into... What's that place that, if you die in the... that this is the layer of that...
Adam: Yeah, the dream state.
Yan: The dream state, you go to this other separate state. And he talks about how, "Oh, you don't want to do that, because you're going to get stuck there forever." He talks about these things that you shouldn't do, but then he goes and does it himself. That's kind of what separates someone who is very competent from being a master, because you have a deeper understanding, a deeper mental model of how things actually work.
Yan: And a lot of times you can only reach that level through experimentation, through trying things that are not included in documentation, going off the beaten path. I think that's where you learn an awful lot yourself beyond what everybody else knows as well. And that's also where you can create a lot of value in terms of content, because no one else is writing about this.
Yan: I think Zac wrote a really good post around one of the problems. So, basically, with SQS and with Lambda, the fact that you don't control the polling from SQS to Lambda, you can run into a situation where maybe messages are not being processed, because by the time the request hits your Lambda function, your Lambda function doesn't have enough concurrency available, so the request from SQS can get rejected.
Yan: And then the message is going to go back into the queue, and then it's going to get picked up again. And then if it fails again after a number of times, those messages can go straight into the dead-letter queue. So, it's a very peculiar problem that maybe most people are not going to run into, but if you're that 0.1% of use case, then you're going to hit that, and you're going to find that there's very little information out there on how to address these problems and what you even call this problem.
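The failure mode described above can be sketched with a toy simulation. It is a deliberately simplified model, not how the real SQS poller or Lambda scaling works: the function name, the per-round "free slots" input, and the rule that a message is dead-lettered after `max_receive_count` unsuccessful receives are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    body: str
    receive_count: int = 0


def simulate(bodies, free_slots_per_round, max_receive_count):
    """Toy model of SQS-to-Lambda delivery under limited concurrency.

    Each round, every queued message is received once (which bumps its
    receive count). If the function has a free concurrency slot, the
    message is processed; otherwise the invocation is rejected and the
    message returns to the queue, or moves to the dead-letter queue once
    it has used up its allowed receives.
    """
    queue = [Message(body) for body in bodies]
    processed, dead_letter = [], []
    for free_slots in free_slots_per_round:
        next_queue = []
        for msg in queue:
            msg.receive_count += 1  # counted even though no handler code ran
            if free_slots > 0:
                free_slots -= 1
                processed.append(msg.body)
            elif msg.receive_count >= max_receive_count:
                dead_letter.append(msg.body)
            else:
                next_queue.append(msg)
        queue = next_queue
    return processed, dead_letter, [msg.body for msg in queue]


# Three messages, one free slot per round, two receives allowed per message.
processed, dead_letter, remaining = simulate(["a", "b", "c"], [1, 1], 2)
```

In this run message "c" lands in the dead-letter queue without its handler ever executing, which is the surprising part of the scenario: the message did not fail, it was never given a chance to run.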
Yan: So, these are things that you learn either because you're lucky, or maybe you're unlucky, and you're working in an environment where you're hitting these edge cases, or it's something that you do just through a mental exercise of, "I wonder what happens based on how this system works." You look at it from a systematic point of view and then figure out, "Okay, how do I break it? And if I can find those edge cases that are going to maybe break it, then how does it actually work?"
Yan: Then you start to build up that exercise, almost like what people do with chaos engineering in terms of looking at your system and looking at, "Okay, I wonder what the system is going to do if this weird or this edge case happens." And then you try to do those experiments and plan around it, and then actually try and see what your system actually does. So, experimentation is a super, super valuable thing to do for your own personal growth.
Adam: So, I have a question that's totally shifting gears here, but I have to ask it before time's up. Where did the Burning Monk come from? Where did that name originate?
Yan: Yeah. So, I'm a huge fan of Rage Against the Machine. Their debut album cover is the picture of the burning monk who was a monk in the, I think, '40s, a Vietnamese monk who set himself on fire in protest against the persecution of Buddhists at the time in Vietnam. He's quite a famous figure, because the fact that he set himself on fire and just sat there still, it's kind of an amazing demonstration of a human mind over body. But anyway, that's the debut album for Rage Against the Machine, and that was my favorite band growing up. When it came to picking an email that's what I went with.
Adam: So, just really quick. There's some recent news. Do you have any thoughts on, whether it's the Graviton Lambda support or the Cloud Control API, any of the recent stuff that's coming out?
Yan: Yeah, sure. I've got a lot of thoughts around those. I mean, maybe I start with the easier one. The Graviton2 support for Lambda is interesting, because it offers you a 20% cheaper price for Lambda compared to x86, and the performance numbers that Amazon published in their official announcement sound very good. On the workloads they tested they would get up to 19% better performance at a 20% cheaper price point. Now I think they're saying up to 34% better price-performance ratio.
Yan: But then again, I've seen other benchmarks that put Graviton to be a lot slower compared to x86. So, the thing is, if you're doing any CPU-intensive work, you should be benchmarking against your own workload always. But that being said, maybe like 95% of my Lambda functions are just calling some other API, waiting for the response, doing something else, and then calling some other API. So, most of that time it's just sitting idle.
Yan: So, if I'm going to be just wasting my Lambda milliseconds, then it'll be better if I do it with a 20% discount than without it. So, for a lot of the functions that are just doing I/O stuff, Graviton just gives you a free 20% discount on your Lambda bill.
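The arithmetic behind this reasoning can be sketched in a few lines. The model below assumes only what is stated above, a 20% lower per-millisecond price on Graviton, and treats billed duration as the one thing that changes between architectures; the function names are made up for the example, and actual pricing should be checked against the current AWS price list.

```python
GRAVITON_PRICE_FACTOR = 0.8  # assumed: 20% cheaper per unit of billed duration


def relative_cost(duration_ratio, price_factor=GRAVITON_PRICE_FACTOR):
    """Graviton cost relative to x86 for the same function.

    duration_ratio is (Graviton billed duration) / (x86 billed duration):
    1.0 for an I/O-bound function that mostly waits on other services,
    above 1.0 if the code runs slower on Graviton, below 1.0 if faster.
    """
    return duration_ratio * price_factor


def break_even_duration_ratio(price_factor=GRAVITON_PRICE_FACTOR):
    """How much slower Graviton can be before the discount is wiped out."""
    return 1.0 / price_factor


io_bound = relative_cost(1.0)          # same duration, so just the discount
cpu_bound_slower = relative_cost(1.4)  # hypothetical 40% slower workload
break_even = break_even_duration_ratio()
```

Under these assumptions an I/O-bound function costs 0.8 times as much, while a workload that ran 40% slower would cost about 1.12 times as much. The break-even point is a 25% slowdown, which is why benchmarking matters for CPU-intensive functions and matters much less for functions that mostly wait.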
Yan: Even so, I still maintain that with all these different options for you to get discounts on Lambda, you shouldn't do that until you actually work out which functions are expensive and which ones are not, because even if you're doing a small optimization, it's still just work you've got to do unless it's something where you just change your configuration.
Yan: So, maybe you do that for most of your functions, because everything is just I/O. So, just switch to Graviton, and you'll be fine, and just run your end-to-end tests and whatnot afterwards. So, that one is very interesting. And then there's Step Functions, which now supports service integration with AWS services without having to use the specific integrations to specific services.
Yan: That's something that we've been asking for for a while. Basically, just do what API Gateway does for the service integration. And they've done it now. So, that's great news again, a super powerful thing. This makes Step Functions, one of my favorite services on AWS, a lot more powerful. I love that.
Yan: The Cloud Control API is interesting, because I don't know if Amazon's attempting to clean up the differences or subtle differences in its API so that people that are doing some integration work with the custom frameworks and other things have an easier time not to have to remember, for example, tagging, getting the tags from your resources.
Yan: I found about eight different combinations in terms of API calls and the response structure. So, if you're building tools against AWS, those are the things that you have to just deal with constantly, and it's annoying. So, maybe this Cloud Control API is the first step towards having a more standardized data structure for you to get data from AWS resources and then also configure them.
Yan: So, this is probably not something that most of us need to deal with. But certainly, if you're building tools, even things like CDK or Pulumi or other tools that have to automate the provisioning of resources on AWS, something like Cloud Control is going to be a godsend for you.
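As a rough illustration of the tagging example, the sketch below reads a resource through Cloud Control's `get_resource` call and normalizes its tags. The call takes `TypeName` and `Identifier` and returns the resource's properties as a JSON string. Everything else here is an assumption: the helper name is invented, and the property name `Tags` and its shape (a list of `Key`/`Value` pairs or a plain map) vary by resource type, so this is not a universal recipe.

```python
import json


def get_tags(client, type_name, identifier):
    """Return a resource's tags as a plain dict via Cloud Control.

    `client` is expected to behave like boto3.client("cloudcontrol").
    Assumes the resource type exposes a "Tags" property, which is not
    true of every type.
    """
    response = client.get_resource(TypeName=type_name, Identifier=identifier)
    # Cloud Control returns the resource properties as a JSON-encoded string.
    properties = json.loads(response["ResourceDescription"]["Properties"])
    raw_tags = properties.get("Tags") or []
    if isinstance(raw_tags, dict):
        return dict(raw_tags)
    return {tag["Key"]: tag["Value"] for tag in raw_tags}


# Hypothetical usage with real AWS credentials (not executed here):
#   import boto3
#   client = boto3.client("cloudcontrol")
#   tags = get_tags(client, "AWS::SQS::Queue", "<queue-url>")
```

The same function body works for any resource type that matches the assumed tag shape, which is the appeal for tool builders: one call and one response structure instead of a different tagging API per service.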
Adam: Yan, thank you so much. It's been really great, just everything I had hoped for, just getting to sit here and talk with you for an hour. And thanks, everyone, that joined the Twitter space. Next week we'll be on Thursday and Friday. We've got Kesha Williams on Thursday, and then Rafal Wilinski on Friday talking about Dynobase. So, I thank you all for joining live here. And again, thank you so much, Yan. It's been great.
Yan: Thanks, Adam. Thanks for having me. It's been a pleasure.