Dev.Chan64's Blog


How Do Good Developers Grow Through Evaluation?



Performance evaluation is a sensitive and sometimes exhausting topic for developers.
However, if the desired image of a developer is clearly defined, evaluation can transform from a ranking tool into a method for growth and cultural training.

This post discusses how to define the criteria for a good developer, run performance evaluation as a cultural practice, and ultimately build a structure in which reliable outputs accumulate in the team.


1. Why Should Evaluation Be for Culture?

Evaluating a developer’s performance is always challenging.
Just because someone created features quickly doesn’t mean they did well, and writing a lot of code doesn’t necessarily result in valuable outcomes.

Evaluation methods that reduce performance to numbers can make the team timid, induce comparisons among developers, and strengthen competition rather than collaboration.

Instead of this approach, we first define “What do we mean by a good developer?” and based on these criteria, we aim to create a structure where the team can train repeatedly and grow.

The purpose of evaluation is not to rank but to establish standards that allow the team to grow together.

And these criteria lead to outputs documented in code and writing, ultimately becoming the culture of the team itself.


2. Our Definition of a Good Developer

The criteria for judging a good developer are not fixed.
Technology stacks, projects, and teams all differ.

Therefore, the first task is to define ‘What is a good developer?’ as a team.

Instead of seeking a universal criterion that applies to all situations, we gradually refine who our team wants to work with based on our own criteria.

This definition is not a checklist but a collection of philosophies and habits that the team develops as it grows.

Currently, the important criteria are as follows:

🧼 Expiry-Ready Code

Code should not be written to stay forever but structured to be deletable at any time.

More than well-built features, we value code whose removal won’t collapse the entire system.
Ideal code has few dependencies, well-separated responsibilities, and adapts flexibly to changes in the domain.

To write Expiry-Ready code, we aim for low coupling, clearly separated responsibilities, and boundaries that can absorb domain changes.

Being Expiry-Ready is a key indicator of maintainability and scalability, and we train this structural flexibility as part of our team culture.
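
A minimal sketch of what this can look like in practice (the promo-banner feature and all names here are hypothetical, not from the article): the feature lives behind a narrow interface, so deleting it means removing one module and one registration line, not hunting down call sites.

```typescript
// Hypothetical promo-banner feature, isolated behind a narrow interface.
interface HomeScreenWidget {
  id: string;
  render(): string;
}

// The feature lives in its own module; nothing else imports its internals.
const promoBanner: HomeScreenWidget = {
  id: "promo-banner",
  render: () => "<div>Spring sale!</div>",
};

// The host knows only the interface. Deleting the feature means removing
// the module above and this single registration line; nothing else breaks.
const widgets: HomeScreenWidget[] = [promoBanner];

function renderHomeScreen(): string {
  return widgets.map((widget) => widget.render()).join("\n");
}

console.log(renderHomeScreen());
```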

🧪 Mock-Up Focused Thinking

Design starts with imagining how something will be used before it is fully implemented.

Before coding, we value the habit of first drafting mock-up code.
Imagining how functions will be called, what results are expected, and in what context features will be used can lead us to design responsibilities and boundaries much more clearly.

This approach makes responsibilities and boundaries visible before a single line of implementation exists.

Don’t write code first; draw how it will be used first. We call this ‘Mock-Up Focused Thinking’.
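
A hedged sketch of the habit (the checkout API and its names are invented for illustration): the call we wish existed is drafted first, and the signature follows from it.

```typescript
// Step 1: draft the usage first, as if the API already existed.
// This imagined call site fixes the responsibilities and boundaries:
//
//   const receipt = await checkout(cart, { coupon: "WELCOME10" });
//   showSummary(receipt.summary);

// Step 2: only then write the interfaces the mock-up demanded.
interface Cart { items: { sku: string; price: number }[] }
interface Receipt { summary: string; total: number }

async function checkout(cart: Cart, opts?: { coupon?: string }): Promise<Receipt> {
  // Implementation comes last; its shape was decided by the mock-up above.
  const subtotal = cart.items.reduce((sum, item) => sum + item.price, 0);
  const total = opts?.coupon ? subtotal * 0.9 : subtotal; // toy discount rule
  return { summary: `${cart.items.length} items, total ${total}`, total };
}
```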

🔗 Clear Call Interfaces

When calling a function or method, there shouldn’t be a question like, “What does this code do?”

A good interface reduces the context a caller must hold rather than merely hiding complexity, and it is key to smooth collaboration.

A clear call interface is the starting point for collaboration: the call site itself should tell the reader what will happen.
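
A small sketch of the difference (the event-publishing function and its parameters are invented for illustration): positional flags force the reader to open the implementation, while a named signature answers the questions at the call site.

```typescript
// Unclear: the call site raises questions instead of answering them.
//
//   process(data, true, 3);   // what does true mean? what is 3?

// Clear: the signature itself carries the context the caller needs.
interface RetryPolicy { maxAttempts: number }

function publishOrderEvent(
  orderId: string,
  options: { validateFirst: boolean; retry: RetryPolicy },
): void {
  // Body omitted; the point is that the call below reads without it.
  void orderId;
  void options;
}

publishOrderEvent("order-42", {
  validateFirst: true,
  retry: { maxAttempts: 3 },
});
```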

🧠 Domain-Informed Naming

Names are the first manual read in the code.

We believe that names of variables, instances, functions, and classes should explain their roles.

We adhere to two principles:

  1. Name with at least two words

    • Use domain-revealing names like userProfile, paymentResult instead of generic names like data, obj.
  2. IDEs are our tools.

    • Long names are not an issue in an environment with autocompletion.
    • They rather facilitate searching, tracking, and collaboration.

Good naming reveals understanding of the domain and is the first step in organizing team language.
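
A small sketch contrasting the two styles, building on the userProfile and paymentResult examples above (the interfaces themselves are invented for illustration):

```typescript
// Generic, one-word names force the reader to reconstruct the domain:
//
//   const data = await get(id);
//   const obj = transform(data);

// Two-word, domain-revealing names read like the business process itself.
interface UserProfile { userId: string; displayName: string }
interface PaymentResult { succeeded: boolean; receiptUrl: string }

function describePaymentResult(paymentResult: PaymentResult): string {
  return paymentResult.succeeded
    ? `Paid, receipt at ${paymentResult.receiptUrl}`
    : "Payment failed";
}

const userProfile: UserProfile = { userId: "u-1", displayName: "Ada" };
console.log(userProfile.displayName);
console.log(describePaymentResult({ succeeded: true, receiptUrl: "https://example.com/r/1" }));
```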

📝 Documentation Considers the Reader

Documentation is not just explanation; it’s consideration.

Rather than relying on comments alone, we value conveying the reasons behind decisions, the choices made, and the problems solved through structure and naming.

We prioritize structure and naming over comments: if the code itself reads well, it is the best documentation.
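
One way to read “the code itself is the best documentation”, as a hedged sketch (the churn rule and all names are invented for illustration): the explanatory comment is replaced by a named constant and a named predicate.

```typescript
// Before: a comment carries the reasoning.
//
//   // skip users who churned more than 90 days ago
//   const active = users.filter((u) => Date.now() - u.lastSeenAt < 7776000000);

// After: structure and naming carry the reasoning; the comment is gone.
const CHURN_WINDOW_MS = 90 * 24 * 60 * 60 * 1000;

interface User { lastSeenAt: number }

function hasChurned(user: User, now: number = Date.now()): boolean {
  return now - user.lastSeenAt > CHURN_WINDOW_MS;
}

function activeUsers(users: User[]): User[] {
  return users.filter((user) => !hasChurned(user));
}
```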

📆 Readability with Maintenance in Mind

Good code is judged not today, but later, when it has to be maintained.

It’s crucial whether the code can be modified by someone else in six months or a year rather than today.
We see readability not just as style but as a standard of sustainability and collaboration.

The test is simple: “Can another developer understand and modify this code at first glance?”
If the answer is yes, it’s a success.
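
As a hedged illustration of that question in practice (the order-shipping rule is invented): the same logic, restructured so that a stranger can change one rule without untangling the rest.

```typescript
// Hard to modify later: the actual business rules hide inside nesting.
//
//   if (order) { if (order.paid) { if (!order.shipped) { ship(order); } } }

// Easier for the next reader: each rule is named, each early exit is explicit.
interface Order { paid: boolean; shipped: boolean }

function shipIfReady(order: Order | undefined, ship: (o: Order) => void): void {
  if (!order) return;        // nothing to ship
  if (!order.paid) return;   // rule: never ship unpaid orders
  if (order.shipped) return; // rule: never ship twice
  ship(order);
}
```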

🍝 Complex Code Requires Effort to Explain

Complex code itself is not a problem.
The problem is leaving it unexplained.

Anyone can end up writing spaghetti-like code when responding to urgent issues or wrestling with a complex domain.
We don’t forbid such code.

However, if that code remains within the team, the effort to explain the reasoning and decisions at that moment is absolutely necessary.

The important thing is not perfect documentation.

Whatever the developer managed to leave at the time, the ‘attempt to explain’ itself represents that person’s willingness to grow and a considerate attitude toward the team.

We train ourselves to leave explanations anywhere, in PR descriptions, wikis, READMEs, or comments, and we make this a team culture; a small example follows.
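
What such a trace might look like in the smallest case, a code comment (a sketch only; the vendor scenario is invented for illustration): a short “why”, not a manual.

```typescript
// WHY THIS IS TANGLED: the vendor API returns totals as strings in some
// locales and numbers in others. Normalizing at every call site kept
// breaking under deadline pressure, so the mess is centralized here
// until the vendor fixes their contract.
function parseVendorTotal(raw: string | number): number {
  if (typeof raw === "number") return raw;
  return Number(raw.replace(",", ".")); // decimal comma -> decimal point
}
```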

Spaghetti code with explanations is a technical judgment,
while spaghetti code without explanations is a technical debt.


3. What We Leave Behind Matters More Than Quantity

When performance starts to be judged by numbers, developers focus on quickly creating many features or writing more lines of code.
However, we know better.

Doing a lot doesn’t mean doing well. What really matters is “What did you leave behind?”

The simplest question to judge whether this code is good is this:
“Can a new person come in, understand this code, and participate?”

If the answer to this question is “yes,” then this code is not merely functional; it is a structure in which knowledge survives, an output carrying context and intent rather than mere quantity.


We do not base evaluation on how fast or how much someone produced.

Instead, we base it on what they left behind: structure a newcomer can enter, and outputs that carry context and intent.

When these standards are repeated, the evaluation becomes the language of the team.

And we say this:

If there is unexplained spaghetti code left, that person probably hasn’t yet understood what our team means by ‘a good developer’.

Because the team has a culture that speaks through outputs, not numbers.

We also believe this: regardless of the project, information must remain documented so that anyone can be onboarded at any time.

These documents don’t necessarily have to be text.

Anything executable that a newcomer can run and learn from counts: each such tool is an onboarding document, a sign of the team’s consideration, and a trace of its culture.

And if nothing remains, that person has not understood the culture of the team.

We are not a team that just shares knowledge.
We are a team that makes knowledge survive.


4. Evaluation Structure: A Trainable 3-Step Process

We designed the evaluation not as a judgment on results but as a training process to internalize team culture and foster growth.

Evaluation Phases Summary

| Phase | Actor | Purpose | Key Question |
| --- | --- | --- | --- |
| Phase 1 | Self-Evaluation | Reflect on alignment with standards | “Am I working in this way?” |
| Phase 2 | Peer Evaluation | Empathy and feedback on team standards | “Are my colleagues working in this way?” |
| Phase 3 | Leader Judgment | Assess domain relevance and practical contribution | “Did this code align with actual business needs?” |

🪞 Self-Evaluation – The Start of Internalization

🤝 Peer Evaluation – A Time for Empathy and Confirmation

Peer evaluation is not formally counted in the evaluation; it is a tool for empathy and training, not an official criterion.

To reduce misunderstandings, it may even be conducted after the evaluation results are announced.

This process is used only to check how deeply the team’s culture and standards have been internalized. It is not a tool for comparison or ranking.

If this is not clearly understood, peer evaluation can erode trust rather than build it. So we teach every team member that “peer evaluation is a tool for empathy and training,” and we repeat that training.

🎯 Leader Judgment – The Interpreter of Business Relevance

Through these three phases,
we believe that qualitative standards can be trained repeatedly without quantitative comparisons.

Performance should be revealed not by results,
but by understanding and executing standards within the process.


5. Tools and Structures Used

No matter how good the evaluation criteria are,
without a structure to record, repeat, and reflect on these standards, it’s difficult to establish them as culture.

We operate the team so that standards are naturally recorded, shared, and accumulated through the following tools and habits.

🛠️ We Do Not Conduct PR Reviews

We do not follow the traditional PR cycle of building a feature and then waiting for approval.
Instead, we run a standards-check process built around refactoring.

This process is not a procedure for checking code consistency.
It is structured dialogue time: together, we reread the team’s definition of a good developer and train those standards into the code.

The standards are checked through questions drawn directly from that definition of a good developer.

We call this structure “Refactoring Day” and run it, like a PR cycle, at the end of each sprint.

We do not conduct PR reviews.
We operate a process for refactoring.
However, we use Git’s PR feature as a means.

After a Refactoring Day, the discussions, decisions, and refactoring history remain in the PR record.

These records are not just work histories but intellectual assets containing the team’s thought process, domain judgments, and technical context. A sketch of what one might look like follows.
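
As an illustration only (this template is our sketch, not a format the team prescribes), a Refactoring Day PR description could read:

```markdown
## Refactoring Day – sprint-end record (illustrative template)

**Standard in focus:** Expiry-Ready Code

**What we changed:** Extracted the promo-banner logic behind a widget
interface so it can be deleted without touching the home screen.

**Why (domain judgment):** Marketing retires banners every few weeks;
the old coupling made each removal a multi-file change.

**Open questions for next time:** Should the widget registry itself
be configuration rather than code?
```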

In this structure, there is no longer an ‘approver’.
It’s not important who approves and who gives feedback; the process itself of everyone checking the code against the standards and recalling those standards becomes the team culture.

As self-review becomes possible, the standards become not just external feedback, but internalized habits.

This repetition ultimately helps the entire team to become a trained group that can create good code without needing approval.

📘 Knowledge Base (Wiki, Notion, etc.)

🪞 Self-Evaluation & Retrospective Template

🧠 Leader’s Judgment Sheet (Qualitative Evaluation)

We do not view these tools as evaluation tools.
We see them as structures that allow the team to repeat and train the way they work.

6. Conclusion: Designing Accumulation Rather than Evaluation

We do not want to rank people through evaluations.

We aim to agree on common standards for what makes a good developer and, based on those standards, create a structure that allows the team to grow on its own.

Performance should be verified not by numerical results but by outputs that carry judgment, intent, and domain understanding.
And these outputs should remain as traces accumulated in the team through PRs, documentation, reviews, and retrospectives.

This is how we work.

A good team does not need evaluations.
Within a structure where standards accumulate into culture, the team naturally produces good developers.

This is why we designed evaluations, how we operate the team, and how the culture functions.


Tags: Organizational Culture, Leadership, Design Philosophy, Onboarding